





























MATHEMATICS 
OF STATISTICS 

PART ONE 


BY 

JOHN F. KENNEY 

University of Wisconsin 


FOURTH PRINTING 



LONDON 

CHAPMAN & HALL, LTD. 

11 Henrietta Street, W.C. 2 



Copyright, 1939, by 

D. VAN NOSTRAND COMPANY, Inc. 
All Rights Reserved 

This book or any part thereof may not 
be reproduced in any form without 
written permission from the publishers. 

First Published May 1939 

Reprinted January 1941, February 1942, 
July 1943 


Printed in United States of America 
Produced by 

TECHNICAL COMPOSITION CO. 
BOSTON 


Matri Meae 
in Signum 
Gratitudinis 


In the manufacture of this book, the publishers 
have observed the recommendations of the War 
Production Board and any variation from previous 
printings of the same book is the result of 
this effort to conserve paper and other critical 
materials as an aid to the war effort. 




PREFACE 


The field of statistics is many sided and ranges over different levels. 
However, between the levels of clerical work at one extreme and 
mathematical research at the other extreme, there is a well-defined 
methodology, mathematical in nature, which underlies the specialized 
applications in the departments of economics, psychology, education, 
and biology. 

This book is an elementary text dealing with the mathematics of 
statistics. Fortunately, a considerable part of the descriptive 
methodology of statistics can be understood by those having relatively 
little knowledge of college mathematics. Although no mathematics 
beyond the ordinary Freshman course in college algebra is required 
for a profitable reading of this text, a certain degree of mathematical 
maturity and intelligence is presupposed. To achieve the maximum 
success perhaps only the best of those students whose mathematical 
preparation is limited to the minimum prerequisite should be 
encouraged to study it. Occasionally, material is introduced to sharpen 
the interest and challenge the ability of the more advanced student 
without interrupting the main developments or discouraging those 
less mature. 

In writing this book, considerable selection of material necessarily 
had to be made. The omission of certain topics will be noted in the 
table of contents. Judging from my own experience, and that of 
others, the theory of sampling cannot be taught satisfactorily at the 
level for which Part I is intended. At best only a superficial use of 
formulas could be hoped for. Consequently, I have elected to defer 
this subject to Part II where a systematic treatment can be given. 
With regard to time series analysis, Professor J. Neyman says in his 
Lectures And Conferences On Mathematical Statistics (p. 106), 

We start by trying to split each of the series into several parts, which we 
arbitrarily assume to be additive. One of these parts is the trend, which we 
estimate perhaps by fitting a low order parabola to the whole series available. 
The next part is the “business cycle.” The third part is the “seasonal 
variation,” which we frequently estimate by calculating moving averages. Finally, 
the remainder is considered to arise from random causes, and we concentrate 
on the question whether such a remainder in one of the variables is correlated 
with that in some other. All this procedure seems to me very artificial and 
arbitrary. ... In my opinion the whole problem of time series must be treated 
from a point of view that is quite different from the traditional one just described. 

I concur in this opinion and I believe that no useful purpose would 
be served by drilling students in the traditional procedures. 

Throughout the book the student is encouraged and stimulated to 
master fundamental principles and concepts. Essentially, the job 
of every statistician is to take hold of situations and disentangle 
them by the techniques of the science. Therefore, considerable 
emphasis is placed on technique. I have tried to develop in the 
student the ability to use symbolism creatively as a language. 
Numerous examples are given to clarify concepts and illustrate 
processes. Over two hundred exercises are included. It is intended 
that these exercises should be handled as in a mathematics course. 
No laboratory, so-called, is necessary. 

Nowadays, no little importance is attached to motivation. I have 
constantly held in mind the necessity of making the subject 
interesting and stimulating to the beginning student. Nevertheless, I 
venture the opinion that the best motivation for intelligent students is 
the feeling that their teacher knows his subject. 

In preparing the manuscript a large number of books and papers 
have been examined and perhaps leaned upon. No claim to 
originality is made except possibly in the matter of arrangement and 
pedagogical approach. Numerous references to the scholarly 
achievements of others are cited. It is hoped that the serious student will 
read some of these and thereby widen his perspective and enhance 
his interest. 

In conclusion, I wish to express my deep appreciation to Professor 
Allen T. Craig and Dr. Mason E. Wescott who critically read the 
manuscript and made many suggestions for its improvement. 


Evanston, Illinois. 
April, 1939 


John F. Kenney 




CONTENTS 

INTRODUCTION 

1. Definition 
2. Scope 
3. Statistical Methods in the Social Sciences 
4. Mathematics and Statistics 
5. Problem Assignments 
6. Calculating Machines 
7. Collateral Reading 

CHAPTER I 
FREQUENCY DISTRIBUTIONS 

1. Variates 
2. Accuracy of Measurements 
3. Necessity for Classification 
4. Tabulation 
5. Frequency Distribution 
6. Class Intervals 
7. Distinction between Class Limits and Class Boundaries 
8. Rules for Making a Frequency Distribution 
9. Cumulative Frequencies 
10. Additional Distributions 

CHAPTER II 
GRAPHICAL REPRESENTATION 

1. The Function Concept 
2. Charts 
3. The Frequency Polygon 
4. Histogram 
5. Frequency Curves 
6. Ogives 
7. Relation of Cum f to Areas 

CHAPTER III 
AVERAGES 

1. Introductory 
2. Notation 
3. Arithmetic Mean 
4. Weighted Arithmetic Mean 

5. Arithmetic Mean from Frequency Table 
6. Translation of Axes; Deviations 
7. Properties of x̄ 
8. Short Methods of Computing x̄ 
9. Geometric Explanation 
10. Mean of Means 
11. The Mode 
12. The Median 
13. Median of a Frequency Distribution 
14. Graphical Interpretation of Mean, Median, and Mode 
15. Discussion 
16. Geometric Mean 
17. Harmonic Mean 

CHAPTER IV 
MOMENTS 

1. Moments about an Arbitrary Origin 
2. Moments in Units of the Class Interval 
3. Moments about the Mean 
4. Relations between the μ's and the ν's 
5. Standard Deviation 
6. Standard Units 
7. Moments in Standard Units 
8. Use of α₃ and α₄ 
9. Summary 
10. Sheppard’s Corrections 

CHAPTER V 
MEASURES OF DISPERSION 

1. Introduction 
2. The Quartile Deviation 
3. Mean Deviation 
4. The Standard Deviation 
5. Relative Dispersions 
6. Scaling a Distribution in Terms of σ 
7. Semi-Interquartile Range in Terms of σ 
8. N Small, Ungrouped Data 
9. Standard Deviation of Combinations of Sets 
10. Graphical Representation 

CHAPTER VI 
TYPES OF DISTRIBUTIONS. THE NORMAL CURVE 

1. Skewness and Kurtosis 
2. Frequency Curves 
3. The Normal Curve 

4. Standard Form 
5. Tables of Standard Ordinates and Areas 
6. Properties 
7. Curve Fitting 
8. Graduation 
9. Purpose of a Graduation 
10. Probability Paper 

CHAPTER VII 
CURVE FITTING 

1. Empirical Expressions 
2. Linear Functions 
3. Quadratic Function 
4. Fitting a Straight Line 
5. Graphically 
6. Method of Moments 
7. An Alternative Procedure 
8. Least Squares 
9. Simplification 
10. Time Series 
11. Exponential Trends 
12. Ratio Charts 
13. Further Remarks on the Exponential Function 
14. Parabolic Trend 
15. The Gompertz Curve 
16. Remarks and References 

CHAPTER VIII 
CORRELATION THEORY 

1. The Meaning of Simple Correlation 
2. The Coefficient of Correlation 
3. Other Formulas for r 
4. Regression 
5. The Standard Error of Estimate 
6. Properties of the Correlation Coefficient and Standard Error 
7. Further Discussion 
8. Coefficient of Alienation 
9. Correlation Table 
10. Notation 
11. Means and Variances 
12. Computation of Means 
13. Computation of r 
14. Sign of r. Grouping Errors 
15. Regression Lines for a Correlation Table 
16. Applications 

17. S_y for a Correlation Table 
18. Normal Correlation Surface 
19. Properties of Normal Bivariate Surface 
20. Reliability of Predictions 
21. Non-Linear Regression. Correlation Ratio 
22. Computation of η 
23. Further Discussion, Test for Linearity of Regression 
24. Correlation from Ranks 
25. Comparison of ρ with r 
26. Interpretation. Common Elements 

Review Questions and Problems 

Tables 

Index 
















MATHEMATICS OF STATISTICS 


INTRODUCTION 

1. Definition. The word statistics is used in at least two different 
senses. Construed as plural it refers to the systematic presentation 
of quantitative data. Used in a singular sense, the word statistics 
refers to the science which has for its object the classification and 
analysis of quantitative data so that intelligent judgments may be 
passed upon them. 

It is usually clear from the context which meaning¹ is intended, 
although some persons prefer the expression statistical methods 
for this second meaning. Statistical methods are all those devices 
used in the collection and analysis of data. The theory of 
statistics is the exposition of statistical methods and is of a mathematical 
nature. 

2. Scope. There is a rather widespread misapprehension that 
statistics is a branch of economics. As a matter of fact, statistical 
problems arise in many different fields: biology, economics, 
insurance, education, physics, and astronomy, as well as various branches 
of business. The exploration of certain aspects of nearly every field 
involves some phase of statistical theory. Indeed, certain types of 
statistical methodology may have almost unexpected applications: 
the discovery, for example, that the lives of physical plants are 
governed by much the same statistical rules as the lives of human 
beings, and hence, that life tables may be applied to both. Also, 
physicists are discovering that many of the problems in the 
modern theory of the structure of the atom are essentially statistical in 
nature. 

Statistics as a science is making contributions to all the sciences. 
On the other hand, some sciences like biometry and physics have 
contributed much in the development of statistics and its terminology. 

¹ In addition to the two meanings given above, another has crept into the 
recent literature where reference is made to a statistic. This term will be 
explained later. 



The following quotation from Science may appropriately be 
mentioned here: 

The extension of the scope of quantitative methods through the medium of 
statistical analysis is one of the most significant things going on in the scientific 
world at the present time.¹ 

The importance of statistical method in present-day thinking has 
been well stated, as follows: 

More and more the modern temper relies upon statistical method in its 
attempts to understand and to chart the workings of the world in which we live. 
Particularly in those sciences which deal with human beings, whether in their 
physical and biological aspects or in their social, economic, and psychological 
relations, the spirit of our time asks that its conclusions be based not so much 
upon the distinctive reactions of one or two individuals as upon the observation 
of large numbers of individuals, the measurement of their common likenesses and 
the extent of their diversity. As the data thus gathered from mass phenomena 
become extensive, it becomes imperative to have methods of organization to 
bring the facts within the compass of our understanding, methods of analysis 
to make the essential relations appear out of the mass of detail in which they 
are hidden, and methods of classification and description to facilitate the 
presentation of the data for the study and consideration of other persons. Thus 
statistical method becomes a telescope through which we can study a larger 
terrain than would be accessible to our unaided vision.² 

3. Statistical Methods in the Social Sciences. Because statistics 
is fundamentally the study of aggregates of individuals, rather than 
of individuals, whether these individuals be observations or 
measurements or persons, it is apparent that statistical methods are essential 
to social studies. Indeed it has been said that it is principally by the 
aid of such methods that these studies may be raised to the rank of 
sciences. 

This particular dependence of social studies upon statistical methods 
is mentioned in a recent book³ from which we quote the following: 

If, as seems probable, our present uncoordinated large-scale business is to be 
further developed into an efficiently managed instrument of production serving 
the needs of the people, then statistics, together with mathematical economics, 
will emerge among the most important tools of the social sciences. For it is by 
means of averages, dispersions, coefficients of variability, trends, and regressions, 
as pictured in control charts, that management is able to visualize and direct the 
movements of large masses of population. 

¹ Science. January 18, 1929. 

² Walker, Mathematics and Statistics. Sixth Yearbook, National Council of 
Teachers of Mathematics. 

³ Reprinted by permission from Methods of Statistical Analysis by Davies and 
Crowder, published by John Wiley and Sons, Inc. 







The work of the statistician is much like that of the map maker who presents 
the traveler with a sketch of important highways, showing the locations of towns 
and geographical features. The map is not a picture of reality. It shows cities 
as dots, and rivers as lines. It has purposely omitted the interesting details of 
scenery and the still more important features of human interest which lie along 
the route and which constitute the traveler’s real objectives. Nevertheless, as 
a means of reaching these objectives, the map is extremely useful. And so it is 
with statistics in the hands of the business executive and statesman. Back of the 
charts are human beings with their varying characteristics and vital interests, 
few of which can be described in figures. Yet as a means of serving these interests, 
of keeping trade moving from one region to another, of allocating investment and 
labor, and of apportioning relief to maladjusted industries and dependent classes, 
statistics and mathematical methods are important, and are becoming 
increasingly important with the growing complexity of society. 

It may be said that the study of statistics is not merely an attempt to 
describe what actually occurs, though it must begin at this point, but in its broader 
aspects it is the logical background of business and social management. Hence 
what appears now to be mere abstraction may later become the basic necessity 
of an applied science. Eventually, it may be assumed, the social arts of business 
and politics will rest upon as substantial a theoretical and mathematical 
background as physics, chemistry, and engineering. 

4. Mathematics and Statistics. Statistical problems are of 
interest, therefore, not only to the worker in the particular field but also 
to the mathematician, inasmuch as methods adequate to the 
treatment of these problems can best be presented in the precise and 
accurate language of mathematics. Moreover, statistical methods 
are grounded in statistical theory which is a branch of applied 
mathematics. 

Although it is true that some statistical problems are ultimately 
problems in advanced mathematics, many of which mathematicians 
have not yet been able to solve, nevertheless a large and interesting 
part of statistical analysis requires mathematics no more advanced 
than elementary algebra. 

It has been said that sooner or later every true science tends to 
become mathematical. The notation of mathematics is simply a 
language and it is not limited to any particular field of knowledge. 
The following quotations are inserted to help the student approach 
the study of statistics in the proper spirit. 

1. Mathematics, the science of the ideal, becomes the means of investigating, 
understanding, and making known the world of the real. (White) 

2. Probably among all the pursuits of the university, mathematics 
preeminently demands self-denial, patience, and perseverance. ... (Todhunter) 

3. From time immemorial, there has been but one way to become a 
mathematician and there will never be another: it is a way interior to the subject and 
involves years of assiduous toil. Short-cuts to mathematical scholarship there 
are none, whether the seeker be a philosopher or a king. (Keyser) 

4. Will is the creative force. Without the will to learn there is no learning. 
And when the will is feeble and confused, learning lags. (Mursell) 

5. The theory of statistics is not easy, not so much because it is abstruse, as 
because the ideas are new to most people, and a good deal of hard thinking and 
patient work will be necessary. . . . Statistical work always involves a lot of 
computing [and] there is no better way of learning statistics than by working 
through examples. (Tippett) 

5. Problem Assignments. The student should realize at the 
outset that statistical methods are not substitutes for thinking but are 
aids and supplements to it. A superficial knowledge of statistical 
technique cannot take the place of good judgment. Mere ability to 
substitute in formulas should not be confused with genuine statistical 
sophistication and insight. To the serious and capable student who 
intends to master this course, formulas will be a set of functioning 
concepts and tools rather than machines into which material may be 
fed to grind out a meaningless answer. 

Throughout the book exercises are inserted to give the student an 
opportunity to test his knowledge of the theory and methodology, 
and to develop his power of analysis. In grading the solutions, value 
will be attached to accuracy, thoroughness, neatness, and systematic 
arrangement of the work. 

6. Calculating Machines. A full description of the parts of a 
calculating machine and their operation may be obtained from an 
Instruction Book which is furnished by the manufacturer, so only a 
brief description will be given here. 

A calculating machine is constructed to add and subtract. By 
means of continued addition or subtraction, operations involving 
multiplication, division, and square root can also be performed with 
great speed. 

In addition to a keyboard on which numbers can be punched, most 
machines have a sliding carriage, carrying two dials one above the 
other. In finding a product nx, one of the factors n is punched on 
the keyboard and as the motive crank at the side is turned (either by 
hand or by electric motor), the other factor x appears on the upper 
dial. The product nx is then read from the lower dial. 

An important property of the modern calculating machine is its 
adaptability to short cuts and combinations of operations. For 
example, one may multiply two numbers n and x together and add the 
result to a third number k without tabulating the intermediate steps. 
This is accomplished by first registering the number k on the lower 
dial and then proceeding as in finding the product nx. The result 
nx + k is then read from the lower dial. An extension of this 
procedure is especially useful in a series of computations where k and n 
are constant and various values are assigned to x. To describe the 
procedure, suppose it is required to calculate the successive values of 
12 + 6x for x = 5, 7, 15, 12, etc. The number k = 12 is first 
registered on the lower dial, then the factor n = 6 is placed on the 
keyboard, and by turning the crank forward five times to make the first 
value of x = 5 appear on the upper dial, the result 12 + 6 × 5 
appears on the lower dial. Instead of clearing the dial, the crank is now 
turned forward twice more to rebuild the value x = 5 into x = 7, 
and the result 12 + 6 × 7 can be read from the lower dial. In 
rebuilding x = 15 into x = 12 the crank is turned backwards. This 
procedure can be repeated until all the required values of 12 + 6x 
have been calculated. A process of this sort is called the continuous 
method of calculating. 
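
For readers who have no machine at hand, the continuous method may be imitated in a few lines of Python. This is only a sketch under simplifying assumptions: the name continuous_method and the variables lower_dial and upper_dial are invented for the illustration, and the shifting of the carriage for tens and hundreds is ignored, so every unit of x costs one turn of the crank.

```python
def continuous_method(k, n, xs):
    """Imitate the continuous method: k on the lower dial, n on the keyboard."""
    lower_dial = k   # k is registered on the lower dial before any turning
    upper_dial = 0   # net number of crank turns, i.e. the current value of x
    results = []
    for x in xs:
        turns = x - upper_dial            # positive: forward, negative: backward
        step = 1 if turns > 0 else -1
        for _ in range(abs(turns)):
            lower_dial += step * n        # each turn adds (or subtracts) n
            upper_dial += step
        results.append(lower_dial)        # read k + n*x from the lower dial
    return results


values = continuous_method(12, 6, [5, 7, 15, 12])
print(values)  # [42, 54, 102, 84]
```

The dials are never cleared between readings; going from x = 15 to x = 12 simply takes three backward turns, as described in the text.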

In most of the exercises in this course, the computations are not 
laborious and calculating machines are not required. However, if 
machines are available they may be used to advantage in Chapters 
IV and VI. The student who desires to develop skill on a 
calculating machine should begin now to study an Instruction Book and 
practice the fundamental operations explained there. 

7. Collateral Reading. Perhaps no single textbook can meet all 
the needs of all students of statistics. There are several good books 
on elementary statistics which, although not fundamentally different, 
present different points of view on certain topics and treat them with 
varying degrees of emphasis depending upon the field of major 
interest. At least some of the books listed below should be readily 
available on the reserve shelf of the library. The list should be useful to 
those who wish to study more fully certain details in which they may 
be interested. 

1. Baten, Mathematical Statistics. John Wiley and Sons, Inc. 
2. Bivins, The Ratio Chart in Business. Codex Book Co. 
3. Burgess, The Mathematics of Statistics. Houghton, Mifflin and Co. 
4. Camp, The Mathematical Part of Elementary Statistics. D. C. Heath and Co. 
5. Davis and Nelson, Elements of Statistics. Principia Press. 
6. Haskell, Graphic Charts in Business. Codex Book Co. 
7. Garrett, Statistics in Psychology and Education. Longmans, Green and Co. 
8. Glover, Tables of Applied Mathematics. Wahr. 
9. Mills, Statistical Methods, Revised. Henry Holt and Co. 
10. Pearl, Medical Biometry and Statistics. W. B. Saunders and Co. 
11. Philips, The Principles of Financial and Statistical Mathematics. Prentice-Hall, Inc. 
12. Richardson, An Introduction to Statistical Analysis. Harcourt, Brace and Co. 
13. Snedecor, Statistical Methods. Collegiate Press, Inc., Ames, Iowa. 
14. Walker, Mathematics Essential for Elementary Statistics. Henry Holt and Co. 
15. Yule and Kendall, The Theory of Statistics. Griffin and Co. 




CHAPTER I 

FREQUENCY DISTRIBUTIONS 

1. Variates. In general, statistics are obtained as the result of 
observations which aim to establish the magnitude of certain 
variables. These magnitudes are sometimes called variates. For 
example, in computing the average monthly rainfall of a region the 
variable is rainfall and the amount of rainfall for any month is a 
variate. Likewise, if the bank clearings of the city of Evanston be 
under consideration, then the variable is bank clearings, and the 
clearings for any specified interval are variates. It is customary to 
denote a variable by $x$ and its variates by $x_1, x_2, x_3, \ldots, x_N$. 

Variates are of two kinds: continuous and discrete. Continuous 
variates are those which may differ by infinitesimal amounts, such as 
heights, weights, temperatures, ages. All the numbers between 
x = 0 and x = 1 form a set of continuous variates. But if we 
restrict x to the rational numbers in this interval we have a set of 
separate and distinct values with “vacant” spaces between them. Values 
of a variable which are thus restricted to particular values in order to 
have any meaning are called discrete variates. Other examples of 
discrete variates are: size of families, closing prices of stocks, 
“successes” in tossing a coin. A set of discrete variates is usually 
obtained by counting whereas continuous variates are usually obtained 
by measurement. 

2. Accuracy of Measurements. In the case of continuous 
variates, the observed values as recorded can never be absolutely 
established by measurement. Thus, the height or weight of an object can 
be measured only approximately, the error depending upon the 
precision of the instrument and the care and accuracy of the observer. 
However, it is not always necessary that measurements be recorded 
as accurately as it is possible to make them. Similarly, in the case 
of discrete variates the standard of accuracy used may be less 
than it is possible to obtain. In population statistics, for example, 
it may be sufficient to record the numbers to the nearest 
thousand, with three zeros at the end to fill out to the decimal point. 
Thus, 

City    Population 
A       326,000 
B       729,000 

On the other hand, the exact number of students in a university 
might be required. The degree of accuracy needed is determined 
by the purpose of the investigation and it is limited by the closeness 
with which the variables can be measured. 

It follows, therefore, that the degree of accuracy in the final result 
of a problem involving computations is limited by that of the original 
data. Students sometimes carry results of problems to five or more 
decimal places when the original data do not justify more than two 
or three decimal places. A table of measurements which constitutes 
the raw data for a statistical investigation should always specify the 
degree of accuracy in the readings. Thus, if monthly rainfall is being 
measured to the nearest hundredth of an inch, and one measurement 
seems to be exactly 5 inches, it should be recorded as 5.00 inches, with 
two zeros. A measurement that is merely recorded as 5 means it is 
correct to the nearest integer and its true value lies between 4.5 and 
5.5, whereas 5.00 means the true value is known to lie between 4.995 
and 5.005. The three digits in 5.00 are said to be significant. 
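
The convention just described can be checked mechanically. The following Python sketch is offered only as an illustration; the function name implied_bounds is an assumption of the sketch, and the measurement is passed as a string so that trailing zeros such as those in 5.00 are not lost.

```python
from decimal import Decimal


def implied_bounds(recorded: str):
    """Return the interval in which the true value lies for a recorded reading."""
    value = Decimal(recorded)
    # The exponent is 0 for "5" and -2 for "5.00": the last recorded place.
    unit = Decimal(1).scaleb(value.as_tuple().exponent)
    half = unit / 2   # half a unit in the last recorded place
    return value - half, value + half


print(implied_bounds("5"))     # bounds 4.5 and 5.5
print(implied_bounds("5.00"))  # bounds 4.995 and 5.005
```

Ordinary floating-point numbers would not serve here, since 5 and 5.00 are the same float; the decimal type keeps the number of recorded places.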

3. Necessity for Classification. After the data have been 
collected in any statistical investigation the first step has to do with 
introducing order in the raw material. Usually we have some 
hundreds of variates which have been recorded merely in the arbitrary 
order in which the observations or measurements happened to be 
made. But in order to analyze a series of variates so that intelligent 
judgments may be formed about it or that comparisons may be made 
between two series of variates, proper classification is necessary and 
of prime importance. 

Such classification is not always an easy thing to effect, because it 
is the one part of statistical methods for which no very definite rules 
can be given. Most people, until they have tried, imagine that to 
collect and arrange data in classes and in tables is a straightforward 
procedure involving no great technique or experience. Although 
much can be learned from a careful study of the illustrations and 
discussions that appear in the following pages and the compilations of 
reputable bureaus such as the census volumes, nevertheless, 
experience is the best teacher in effecting the most appropriate classification 
for any set of variates. 

4. Tabulation. In carrying out the process of classification, it 
becomes natural to arrange the results in tabular form, setting forth 
clearly and explicitly the statistics one wishes to present. In 
drawing up any table the following general rules should be observed: 

(1) Every table must be self-explanatory. To accomplish this 
the title should be short, but not at the expense of clearness. 

(2) Full explanatory notes, when necessary, should be incorporated 
in the table, either directly under the descriptive title and 
before the body of the table, or else directly under the form. 

(3) The columns and rows should be arranged in a logical order to 
facilitate comparisons. 

(4) In tabulating long columns of figures, spaces should be left 
after every five or ten rows. Long unbroken columns are 
confusing, especially when one is comparing two numbers in a 
row but in widely separated columns. 

(5) If the numbers tabulated have more than three significant 
figures, the digits should be grouped in threes. Thus, one 
should write 4 685 732, not 4685732. 

(6) Forms should be set off by double lines at the top and bottom. 
If the table nicely fills the width of the page, no side lines should 
be used. In such cases the omission of the side lines will have 
the tendency to emphasize the other vertical lines and cause 
the interior columns to stand out better. The columns should 
not be widely separated and the form of a narrow, compact 
table should have its side lines. 
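
Rule (5) is easily applied by machine. As a small illustration in Python (the function name group_in_threes is an assumption of this sketch, not a standard routine), the built-in thousands separator may be replaced by a space:

```python
def group_in_threes(n: int) -> str:
    """Write an integer with its digits grouped in threes, separated by spaces."""
    # Format with commas as thousands separators, then swap commas for spaces.
    return f"{n:,}".replace(",", " ")


print(group_in_threes(4685732))  # 4 685 732
```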

5. Frequency Distribution. From the standpoint of a 
mathematical analysis of statistics, the most important form of tabulation is 
the so-called frequency distribution. Rough data do not present 
any clear ideas of description unless they are organized and condensed 
in a systematic way. We therefore partition the raw data into 
classes of appropriate size, showing the corresponding frequency of 
variates in each class. When any set of statistics is systematically 
arranged in this way it is called a frequency distribution. For 
example, upon an examination of the raw data of Table 1, it is difficult 
to state any very definite conclusions as to whether these grades 
represent preponderantly good students or poor ones. The frequency 

Table 1. Grades of 100 Students in Freshman Mathematics 

75  86  66  86  50  78  66  79  68  60 
80  83  87  79  80  77  81  92  57  52 
58  82  73  95  66  60  84  80  79  63 
80  88  58  84  96  87  72  65  79  80 
86  68  76  41  80  40  63  90  83  94 
76  66  74  76  68  82  59  75  35  34 
65  63  85  87  79  77  76  74  76  78 
75  60  96  74  73  87  52  98  88  64 
76  69  60  74  72  76  57  64  67  58 
72  80  72  56  73  82  78  45  75  56 


distribution of Table 2, however, does give us more precise 
information. We see at a glance that there were 32 students with grades 
between 70 and 80, and that all but 16 had grades of 60 or above. In 
Table 3, the confusion of detail is still more apparent. The 
corresponding frequency distribution is given in Table 4. 
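
The passage from the raw grades of Table 1 to the frequencies of Table 2 can be sketched in Python as follows. The function name frequency_table and its parameters low and width are assumptions made for this illustration; the class limits 30-39, 40-49, and so on are those of Table 2.

```python
from collections import Counter

# The 100 grades of Table 1, in the order in which they are printed.
grades = [
    75, 86, 66, 86, 50, 78, 66, 79, 68, 60,
    80, 83, 87, 79, 80, 77, 81, 92, 57, 52,
    58, 82, 73, 95, 66, 60, 84, 80, 79, 63,
    80, 88, 58, 84, 96, 87, 72, 65, 79, 80,
    86, 68, 76, 41, 80, 40, 63, 90, 83, 94,
    76, 66, 74, 76, 68, 82, 59, 75, 35, 34,
    65, 63, 85, 87, 79, 77, 76, 74, 76, 78,
    75, 60, 96, 74, 73, 87, 52, 98, 88, 64,
    76, 69, 60, 74, 72, 76, 57, 64, 67, 58,
    72, 80, 72, 56, 73, 82, 78, 45, 75, 56,
]


def frequency_table(values, low=30, width=10):
    """Tally integer values into classes low..low+width-1, and so on upward."""
    indices = [(v - low) // width for v in values]   # class index of each value
    counts = Counter(indices)
    table = []
    # Walk every class from the lowest to the highest, empty classes included.
    for i in range(min(indices), max(indices) + 1):
        lower = low + i * width
        table.append((f"{lower}-{lower + width - 1}", counts[i]))
    return table


for limits, freq in frequency_table(grades):
    print(limits, freq)
```

The printed frequencies are 2, 3, 11, 20, 32, 25, and 7, in agreement with Table 2; the first three classes together account for the 16 students below 60.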


Table 2. Frequency Table of 100 Grades 

Class Limits   Tally Marks                                  Frequency 
30-39          //                                                2 
40-49          ///                                               3 
50-59          ///// ///// /                                    11 
60-69          ///// ///// ///// /////                          20 
70-79          ///// ///// ///// ///// ///// ///// //           32 
80-89          ///// ///// ///// ///// /////                    25 
90-99          ///// //                                          7 
Total                                                          100 


The width of a class is called the class interval, and in general 
the successive class intervals should be of equal width. The 
mid-value of such an interval is variously called the class mark, 
mid-value, central value. The width of a class interval is therefore 
seen to be the common difference between two consecutive class 
marks. It is also the difference between the lower (or upper) 
limits of two successive classes. Thus, in Table 4, the class 
interval is half an inch and the successive class marks are 0.245, 0.745, 
etc., inches. 




Table 3 — Monthly Rainfall at Iowa City, 1890-1925 
Year Jan. Feb. Mar. Apr. May June July Aug. Sept. Oct. Nov. Dec. 

1890 2.75 0.75 1.80 1.83 2.20 7.99 0.30 2.29 1.44 2.11 1.56 0.31 

1891 1.49 1.30 4.41 1.11 4.46 2.80 3.01 3.45 2.33 1.63 2.93 2.72 

1892 1.46 1.23 3.15 4.30 9.23 8.29 6.20 2.50 1.18 1.02 1.38 2.84 

1893 1.18 1.75 2.82 4.37 1.79 3.01 3.56 1.64 3.07 1.98 1.75 1.52 

1894 1.95 1.64 2.03 2.72 3.09 2.40 0.90 2.40 4.96 2.30 1.80 0.98 

1895 2.37 0.64 1.25 1.66 4.26 1.10 10.10 1.77 3.43 1.38 1.78 2.84 

1896 0.70 1.51 0.92 5.14 4.10 1.86 7.04 2.44 1.82 2.74 1.16 0.55 

1897 3.66 1.30 2.07 4.60 3.11 2.38 3.83 1.85 3.54 0.33 1.98 2.48 

1898 4.62 1.15 3.02 2.89 4.80 3.26 2.27 2.85 2.54 4.38 1.10 0.53 

1899 0.59 1.82 1.43 3.23 9.49 4.50 3.78 2.39 0.93 1.66 1.15 1.93 

1900 0.73 2.20 3.32 3.31 4.31 2.18 5.25 6.27 4.35 3.61 1.43 0.75 

1901 1.07 1.97 3.62 2.36 1.54 3.33 1.29 0.66 2.56 1.78 0.79 2.34 

1902 1.29 0.85 1.29 1.91 3.75 7.46 6.89 10.91 5.87 3.12 2.25 2.21 

1903 0.67 1.03 1.86 3.11 6.90 1.95 4.76 3.45 5.38 3.60 0.97 1.27 

1904 1.74 0.84 2.73 5.49 2.68 2.14 2.49 3.93 3.12 1.59 0.25 1.96 

1905 1.22 1.90 2.28 3.36 5.37 6.68 3.59 2.62 1.54 5.36 2.92 1.04 

1906 2.51 1.73 2.25 1.83 2.33 3.64 1.42 5.34 0.89 1.48 3.08 1.64 

1907 2.12 0.22 1.59 1.58 5.47 6.04 9.21 2.98 2.85 0.86 1.07 0.53 

1908 0.32 2.08 2.94 2.78 7.78 2.87 5.40 7.47 1.82 1.99 1.84 0.43 

1909 1.97 1.09 2.00 7.21 4.40 4.58 5.75 1.88 2.43 1.59 4.88 2.52 

1910 1.79 0.39 0.28 2.56 3.57 0.98 2.22 4.98 3.87 0.57 0.69 0.46 

1911 0.87 4.82 1.30 3.02 4.74 2.98 3.70 4.27 5.07 2.78 3.01 2.29 

1912 0.26 1.21 2.30 3.50 2.88 2.60 3.60 3.62 2.67 3.54 1.11 0.75 

1913 1.19 1.42 2.69 1.83 6.91 6.28 0.39 2.97 3.19 3.66 0.46 1.02 

1914 1.28 0.93 2.63 2.37 4.87 5.32 1.53 2.99 7.97 1.65 0.37 1.89 

1915 2.15 2.42 0.92 0.65 7.65 4.33 8.11 1.80 9.31 1.84 1.80 0.80 

1916 3.18 0.59 5.06 1.83 5.99 3.92 1.57 2.83 3.49 3.19 1.42 1.15 

1917 1.09 0.19 2.19 3.43 7.33 6.49 2.84 2.79 6.23 2.28 0.30 0.57 

1918 1.10 1.46 0.33 3.43 6.22 8.36 4.87 6.72 2.00 2.05 2.10 1.62 

1919 0.08 2.63 2.65 4.28 4.49 7.07 1.03 2.67 5.10 4.01 3.84 0.61 

1920 0.84 1.33 4.22 4.75 3.76 2.86 2.79 2.90 1.20 0.98 1.80 2.45 

1921 0.35 0.49 2.46 6.20 4.44 2.46 3.59 8.61 7.83 2.47 0.74 3.19 

1922 1.11 1.46 2.18 3.49 5.52 0.28 6.46 1.03 2.91 1.06 5.28 0.49 

1923 1.09 0.67 4.83 0.86 2.63 6.21 2.37 4.01 9.27 2.35 1.13 0.73 

1924 1.35 0.83 2.10 1.09 1.69 8.71 3.67 5.67 2.60 1.64 0.93 1.75 

1925 0.29 1.04 0.99 3.07 1.06 5.61 3.63 3.14 5.59 3.90 1.00 1.66 




Table 4 — Frequency Table of Monthly Rainfall at Iowa City, 1890-1925

Class Interval     Mid-x     Frequency
 0.00- 0.49         0.245        23
 0.50- 0.99         0.745        42
 1.00- 1.49         1.245        58
 1.50- 1.99         1.745        62
 2.00- 2.49         2.245        49
 2.50- 2.99         2.745        47
 3.00- 3.49         3.245        32
 3.50- 3.99         3.745        27
 4.00- 4.49         4.245        18
 4.50- 4.99         4.745        15
 5.00- 5.49         5.245        14
 5.50- 5.99         5.745         7
 6.00- 6.49         6.245        10
 6.50- 6.99         6.745         5
 7.00- 7.49         7.245         6
 7.50- 7.99         7.745         5
 8.00- 8.49         8.245         3
 8.50- 8.99         8.745         2
 9.00- 9.49         9.245         5
 9.50- 9.99         9.745         0
10.00-10.49        10.245         1
10.50-10.99        10.745         1

Total                           432


Table 5 — Monthly Rainfall at Des Moines, 1890-1925

Year  Jan.  Feb.  Mar.  Apr.  May   June  July  Aug.  Sept. Oct.  Nov.  Dec.
1890  2.62  1.17  0.91  0.78  3.00  4.91  1.10  3.35  1.57  4.48  0.74  0.11
1891  1.82  1.13  2.25  2.12  3.29  5.60  2.78  4.22  1.64  2.41  1.34  1.54
1892  1.60  1.35  2.47  3.36  8.77  3.41  8.64  2.45  1.12  2.54  0.76  1.95
1893  0.56  1.28  1.15  5.61  2.84  4.69  3.55  1.60  1.33  0.22  1.51  1.30
1894  1.09  1.39  1.78  1.70  1.41  1.67  0.29  1.89  4.46  2.24  0.99  1.15
1895  1.30  0.60  0.50  3.41  2.86  5.26  3.10  3.57  3.20  0.29  0.85  1.86
1896  0.60  0.79  1.24  3.47  6.50  2.69  8.15  5.49  3.61  2.69  1.10  0.85
1897  2.02  0.71  2.13  7.37  2.31  3.15  2.88  1.77  1.56  0.85  0.34  1.98
1898  1.59  0.82  1.35  2.64  4.22  6.85  1.86  1.09  1.91  3.56  1.87  0.57
1899  0.29  0.57  1.04  2.22  6.71  3.53  3.20  3.53  1.17  0.59  1.76  2.12
1900  0.20  0.50  3.07  3.82  4.76  4.89  5.15  8.02  3.66  3.08  0.96  0.35
1901  1.01  1.11  3.02  2.26  1.40  2.41  1.72  0.67  2.60  2.14  0.40  1.03
1902  0.91  0.52  1.15  1.55  4.69  7.27  5.95  7.82  5.03  3.70  1.65  1.77
1903  0.20  1.12  1.09  1.64  0.64  3.06  3.62  6.72  1.62  1.32  0.31  0.09
1904  1.22  0.22  1.20  5.48  3.16  2.08  6.94  2.60  1.95  1.50  0.06  2.02
1905  1.08  1.00  2.16  3.29  4.44  5.73  4.53  5.21  3.47  3.64  2.34  0.55
1906  2.07  0.86  1.84  2.96  2.21  3.80  2.67  4.69  3.24  1.18  2.29  1.46
1907  0.87  0.93  1.18  1.48  2.97  4.13 10.20  5.03  2.40  1.70  1.12  1.01
1908  0.46  1.15  1.43  2.69  9.89  5.93  1.56  6.54  0.94  3.68  0.95  0.31
1909  1.61  0.90  1.56  5.14  4.24  7.01  4.41  0.14  2.06  2.89  3.71  2.32
1910  1.72  0.20  0.33  1.13  3.26  3.11  0.86  2.40  3.82  0.68  0.53  0.20
1911  0.84  2.91  1.14  4.23  2.44  0.75  1.16  1.82  7.68  2.61  1.22  3.18
1912  0.53  1.86  2.87  2.75  5.62  2.60  3.07  3.52  4.20  3.75  1.11  0.30
1913  1.10  0.65  3.03  3.41  5.06  3.52  1.05  3.44  2.65  2.67  1.03  1.05
1914  0.85  1.24  1.18  1.52  4.83  3.89  1.22  1.77  4.81  3.57  0.35  1.28
1915  1.96  3.20  1.16  1.36  8.21  3.60  9.39  1.71  4.51  0.43  1.24  0.65
1916  2.66  0.61  0.60  2.44  3.87  2.42  1.50  2.62  1.72  2.11  1.46  0.65
1917  6.53  0.52  2.30  5.52  3.94  8.16  1.58  1.82  1.99  0.92  0.21  0.88
1918  0.78  1.45  0.29  1.81  5.87  5.63  1.18  2.54  0.91  3.81  2.10  1.35
1919  0.08  3.00  3.67  5.30  2.96  7.36  2.68  2.19  7.47  2.20  3.84  0.93
1920  0.44  0.74  3.92  4.09  3.14  1.25  5.66  2.11  4.44  1.89  1.63  1.38
1921  0.59  0.92  1.07  3.72  3.62  4.66  2.49  6.63  7.16  1.51  0.35  0.80
1922  0.85  0.64  2.25  2.84  6.87  1.63  7.13  6.63  3.00  3.41  2.54  0.25
1923  0.88  0.36  4.34  1.76  4.78  4.95  0.78  5.34  5.17  1.10  0.55  0.61
1924  1.02  1.98  3.10  0.78  1.26  9.30  0.98  4.15  3.47  0.77  0.53  1.62
1925  0.23  0.50  0.88  1.64  0.77  6.40  2.21  4.79  3.75  3.22  0.32  1.67


6. Class Intervals. Grouping variates into the most appropriate
number of classes is a matter of judgment. The choice of intervals
to be used in tabulating any particular set of variates depends upon
the nature and characteristics of the data and the purpose for which
it is to be used. In the case of discrete variates, the unit is a natural
interval and sometimes it is satisfactory. (See Tables 10 and 11.)
However, for both discrete and continuous variates the following
conditions should guide the choice: (a) We desire to be able to treat
all the values assigned to any one class, without serious error, as if
they were equal to the class mark for that interval; e.g., as if all
23 items in the first class of Table 4 were exactly 0.245 inches, etc.
(b) For convenience and brevity we desire to make the interval as
large as possible subject to the first condition. These conditions will
generally be fulfilled if the interval be so chosen that the whole
number of classes lies between 10 and 25. A small number of classes
leads in general to very appreciable inaccuracy, and a large number
makes a somewhat unwieldy table. A preliminary inspection of the
data should accordingly be made and the highest and lowest values
selected. Dividing the difference between these by the tentative
number of classes, we have our approximate value of the interval.
After a little preliminary reconnoitering an appropriate number of
classes and their limits can be determined. Thus, in Table 3, the
highest value noted was 10.91 and the lowest 0.08 (verify). The
difference between these is 10.83, which suggests that if we took 20
classes we would have approximately a half inch as the width of a
class interval. This, however, assumes we would start with 0.08 as
our lower limit, which would give us awkward figures as limits.
Therefore, our judgment suggests it would be better to start with 0
and continue by half-inch intervals as far as is necessary to take in
the range of the given variates. We have estimated it will take
approximately 20 of these; actually it turns out to be 22. This
number of intervals and their width is consistent with the general
conditions (a) and (b) given above. Section 8 gives some
supplementary rules which in general are helpful in making a frequency
distribution.
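
The reconnoitering described in Section 6 for Table 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name plan_classes and the decision to work in whole hundredths of an inch (to avoid rounding trouble) are ours and do not come from the text. The extremes 0.08 and 10.91, the tentative 20 classes, and the half-inch width starting at 0 are taken from the discussion above.

```python
def plan_classes(lowest, highest, tentative_classes, width, start=0):
    """Return (rough_width, classes_needed); all lengths in hundredths of an inch.

    rough_width    -- the range divided by the tentative number of classes
    classes_needed -- how many classes of the chosen `width`, beginning at
                      `start`, are required to take in the highest value
    """
    rough_width = (highest - lowest) / tentative_classes
    classes_needed = (highest - start) // width + 1
    return rough_width, classes_needed


# Table 3: lowest 0.08, highest 10.91; try 20 classes, then settle on
# half-inch classes beginning at 0 (hypothetical variable names).
rough, needed = plan_classes(lowest=8, highest=1091, tentative_classes=20, width=50)
print(rough / 100)  # rough width in inches, a little over half an inch
print(needed)       # number of half-inch classes actually required
```

The rough width only suggests a convenient round figure; the final count of classes follows from the round width and the round starting point that judgment selects.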

7. Distinction between Class Limits and Class Boundaries. The
pairs of numbers written in the column of classes of a frequency
distribution are the lower and upper class limits, sometimes called open
class limits. For instance, 1.00-1.49 are the limits of the third class
of Table 4. When the measurements of Table 3 were made, readings
were recorded to the nearest hundredth of an inch. Thus, a
measurement which was more than 1.485 and less than 1.495 was recorded
as 1.49. Likewise, if a measurement was more than 0.995 but less
than 1.005, it would be recorded as 1.00. Therefore, the third class
of Table 4 includes all measurements more than 0.995 and less than
1.495. These values are then the true or closed limits of the third
class and are known as class boundaries or end values. A class
boundary is the value halfway between the upper limit of one class and
the lower limit of the next class. For example, the upper boundary of
the fourth class of Table 4 is 1.995 which is the lower boundary
of the fifth class. If we denote the variate values by x, the
following table illustrates these remarks for the first five classes of
Table 4.

Class Limits     End-x     Mid-x
0.00-0.49        0.495     0.245
0.50-0.99        0.995     0.745
1.00-1.49        1.495     1.245
1.50-1.99        1.995     1.745
2.00-2.49        2.495     2.245

The width of a class interval is the same, however, whether the
classes be expressed in terms of class limits or class boundaries, being
the difference between the beginning of one class and the beginning
of the next class. Similarly, the class mark as the mid-point of the
interval is unaffected. Thus, for the class limits 1.00-1.49, the
class mark is ½(1.00 + 1.49) = 1.245; for the corresponding class
boundaries, the class mark is ½(0.995 + 1.495) = 1.245.

The distinction between class limits and class boundaries is an
important one in plotting graphs, but in tabulating it is the class
limits that should be expressed.
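
The relation between limits, boundaries, and class marks in Section 7 can be checked mechanically. The sketch below is a minimal illustration, not a method given in the text: the helper names class_boundaries and class_mark are hypothetical, and Decimal is used only so that the arithmetic on hundredths is exact. It assumes, as the text states for Table 3, that readings were recorded to the nearest hundredth of an inch.

```python
from decimal import Decimal


def class_boundaries(lower_limit, upper_limit, unit):
    """Boundaries lie half a recording unit outside the class limits."""
    half = unit / 2
    return lower_limit - half, upper_limit + half


def class_mark(low, high):
    """Mid-point of an interval, given either its limits or its boundaries."""
    return (low + high) / 2


unit = Decimal("0.01")                           # nearest hundredth of an inch
lower, upper = Decimal("1.00"), Decimal("1.49")  # third class of Table 4

lo_b, up_b = class_boundaries(lower, upper, unit)
print(lo_b, up_b)                # boundaries of the third class
print(class_mark(lower, upper))  # class mark computed from the limits
print(class_mark(lo_b, up_b))    # class mark computed from the boundaries
```

Both calls to class_mark agree, which is the point made above: moving each end outward by the same half unit leaves the mid-point where it was.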

8. Rules for Making a Frequency Distribution. 

(1) Determine the range of the table by finding the difference
between the highest value and the lowest value among the items.

(2) Determine the number of equal parts into which the range
shall be divided. The size of the class interval and the
number of intervals depend upon the size and nature of the
distribution. (Table 2 contains rather fewer classes than is usually
desirable but an interval of 10 units is quite conventional in
students’ grades. An interval of 5 would be used if grades
of A, A−, B, B−, etc., were given instead of A, B, etc.)
Intervals of 0.5, 1, 2, 3, 5, 7, or 10 are the most common.

(3) Arrange a sheet with three headings: class interval, tally
marks, frequency.

(4) Read off the items in the raw table and for each one record a
mark, as shown in Table 2.

(5) Write the sum of the marks in each row in the frequency
column. The sum of the frequencies should, of course, equal the
total number of variates.
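
The five rules above amount to a counting procedure, sketched here in Python for the grades of Table 1. The function name frequency_table and its arguments are hypothetical; the starting limit 30 and the interval of 10 are those of Table 2. A Counter plays the part of the tally sheet.

```python
from collections import Counter

# The 100 grades of Table 1.
grades = [
    75, 86, 66, 86, 50, 78, 66, 79, 68, 60,
    80, 83, 87, 79, 80, 77, 81, 92, 57, 52,
    58, 82, 73, 95, 66, 60, 84, 80, 79, 63,
    80, 88, 58, 84, 96, 87, 72, 65, 79, 80,
    86, 68, 76, 41, 80, 40, 63, 90, 83, 94,
    76, 66, 74, 76, 68, 82, 59, 75, 35, 34,
    65, 63, 85, 87, 79, 77, 76, 74, 76, 78,
    75, 60, 96, 74, 73, 87, 52, 98, 88, 64,
    76, 69, 60, 74, 72, 76, 57, 64, 67, 58,
    72, 80, 72, 56, 73, 82, 78, 45, 75, 56,
]


def frequency_table(values, start, width):
    """Return (lower limit, upper limit, frequency) for integer variates."""
    tally = Counter((v - start) // width for v in values)  # rule (4)
    n_classes = (max(values) - start) // width + 1         # rules (1), (2)
    return [
        (start + k * width, start + (k + 1) * width - 1, tally.get(k, 0))
        for k in range(n_classes)
    ]


table = frequency_table(grades, start=30, width=10)
for lower, upper, f in table:
    print(f"{lower}-{upper}  {f}")
print("Total", sum(f for _, _, f in table))                # rule (5)
```

The closing total is the check recommended in rule (5): it should equal the number of variates read off the raw table.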

9. Cumulative Frequencies. The frequencies with which we have 
been concerned may be called absolute frequencies to distinguish 
them from two other kinds which will be mentioned in this course; 
namely, cumulative frequencies and relative frequencies. The first 
of these will be considered here. 




Table 6 — Distribution of Intelligence Quotients (IQ’s) of 905 School
Children from 5 to 14 Years of Age. (Derived from
L. M. Terman, The Measurement of Intelligence)

IQ          Number of Children
 55- 64             3
 65- 74            21
 75- 84            78
 85- 94           182
 95-104           305
105-114           209
115-124            81
125-134            21
135-144             5

Sometimes a statistical investigation is concerned with the number
or percentage of variates which are “less than” or “more than”
a given value. This is frequently the case in educational tests and
in wage or salary statistics. Our chief interest in such cases may be
the accumulated frequency of the several class intervals up to some
class boundary. Hence we are led to form a cumulative frequency
table. Such a table is built up by successively adding the several
(absolute) frequencies; thus: f_1, f_1 + f_2, f_1 + f_2 + f_3, etc., as
illustrated in Table 7, where the data of Table 6 are used.


Table 7 — Cumulative Distribution of IQ’s (Table 6)

Class Mark   Frequency   Upper Boundary   Cum f             (Cum f)/N
Mid-x        f           End-x

                         54.5             0                 0.000
 59.5        3 = f_1
                         64.5             3 = f_1           0.003
 69.5        21 = f_2
                         74.5             24 = f_1 + f_2    0.027
 79.5        78
                         84.5             102               0.113
 89.5        182
                         94.5             284               0.314
 99.5        305
                         104.5            589               0.651
109.5        209
                         114.5            798               0.882
119.5        81
                         124.5            879               0.971
129.5        21
                         134.5            900               0.994
139.5        5
                         144.5            905 = N           1.000


The cumulative frequency (cum f) at any class is the total
(absolute) frequency up to the upper boundary of that class. This is the
reason for placing the cum f entries opposite the end-x values and on
lines between the mid-x entries. Thus, in the cum f column of Table
7, three students had IQ’s less than 64.5, 24 less than 74.5, etc. The
entries in the column headed (cum f)/N give the fractions of the
total frequency which are less than the values in the end-x column;
multiplied by 100 they are percentages.
Thus, from this column in Table 7, we can readily see that 88% of
the children had IQ’s less than 114.5 and only 11% less than 84.5.

Table 7 is known as a “less than” table. One could of course
cumulate the frequencies from the bottom of the table, getting a
“more than” distribution. The cum f column would then give the
number of children whose IQ’s are more than the values at the lower
boundaries of the several class intervals.
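
Both kinds of cumulation can be sketched with the standard-library function itertools.accumulate. The variable names below are ours; the frequencies are those of Table 6 and the upper boundaries are the end-x values of Table 7.

```python
from itertools import accumulate

upper_boundaries = [64.5, 74.5, 84.5, 94.5, 104.5, 114.5, 124.5, 134.5, 144.5]
frequencies = [3, 21, 78, 182, 305, 209, 81, 21, 5]  # Table 6
N = sum(frequencies)

# "Less than": cumulate from the top of the table.
less_than = list(accumulate(frequencies))

# "More than": cumulate from the bottom, then restore the original order.
# Entry k counts the IQ's above the lower boundary of class k.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for end_x, cum in zip(upper_boundaries, less_than):
    print(f"{cum:4d} children below {end_x}  ({cum / N:.3f} of the total)")
```

The two columns are complementary: at any boundary the “less than” count and the “more than” count add up to N.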

The inverse operation to cumulating the frequencies is called
“differencing” and is usually denoted by Δ (delta). If S denotes
any series of values, then ΔS denotes the results obtained by
subtracting the first value of S from the second value, the second from
the third, etc. Differencing a column of cumulative frequencies
obviously gives the absolute frequencies. Differencing a column
of (cum f)/N values gives the f/N values.
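
That differencing undoes cumulation can be illustrated with a short round trip. The function name difference is hypothetical, and the data are the frequencies of the telephone-call distribution (Table 8) rather than Table 7, so that the exercise on Table 7 is left to the reader. A zero is placed ahead of the cumulative column, as in the first line of Table 7, so that the first difference returns the first frequency.

```python
from itertools import accumulate


def difference(series):
    """Return ΔS: each value of S minus the value that precedes it."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


frequencies = [1, 28, 88, 180, 247, 260, 133, 42, 11, 5]  # Table 8
cum_f = [0] + list(accumulate(frequencies))               # cumulate

print(cum_f)
print(difference(cum_f))  # differencing gives back the absolute frequencies
```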

Exercises 

1. What is the width of the class interval and the values of the class marks
   in Table 2?

2. Tabulate the grades of Table 1, using class intervals of 5 units.

3. With reference to Table 3, is it easy to answer such questions as the
   following:

   (a) In how many instances was the monthly rainfall between 2 inches and
       3 inches?

   (b) In how many instances was the rainfall less than 5 inches?

   (c) What was the smallest monthly rainfall recorded?

   (d) What per cent of the total measured between 5 inches and 10 inches?

   (e) What measurement is the most common?

4. Refer to Table 4 and then answer the above questions.

5. Using your own judgment as to the most appropriate class interval, make
   a frequency distribution of the monthly rainfall for Des Moines from
   1890 to 1925 (Table 5).

6. For Table 6 state the class boundaries (end values) and the class marks.

7. Difference the cum f column of Table 7.

8. Read the following references:

   (a) Mathematics Essential for Elementary Statistics — Walker, Chapter II.

   (b) Standards and Requirements in Statistics — Belcher. Journal American
       Statistical Association, vol. 21, p. 424.

10. Additional Distributions. The following distributions which 
will be referred to in subsequent chapters will serve as illustrative 
and laboratory material. They are not chosen on account of the 
importance of the data but merely to exemplify methods. 


Table 8 — Distribution of Lengths of
995 Telephone Calls. Time in Seconds

Time       Number of Calls
  0- 99           1
100-199          28
200-299          88
300-399         180
400-499         247
500-599         260
600-699         133
700-799          42
800-899          11
900-999           5


Table 9 — Distribution of Weight in Pounds Among
1000 8-Year-Old Glasgow Schoolgirls

Weight (mid-values)    Frequency
      29.5                  1
      33.5                 14
      37.5                 56
      41.5                172
      45.5                245
      49.5                263
      53.5                156
      57.5                 67
      61.5                 23
      65.5                  3


Table 10

Twelve dice were thrown 4096 times; only a throw of 6 was counted a success.
The observed distribution follows:

Successes    Frequency
   x_i          f_i
    0            447
    1           1145
    2           1181
    3            796
    4            380
    5            115
    6             24
    7              7
    8              1
    9              0
   10              0
   11              0
   12              0

(For future reference: x̄ = 2, σ = 1.296)



Table 11

Twelve dice were thrown 4096 times; a throw of 4, 5, or 6 points being reckoned
a success. The following distribution was recorded:

Successes    Frequency
    0              0
    1              7
    2             60
    3            198
    4            430
    5            731
    6            948
    7            847
    8            536
    9            257
   10             71
   11             11
   12              0

(For future reference: x̄ = 6.139, σ = 1.712)


Table 12 — Frequency Distribution of the Weights of 1000 Male
Students (Original Measurements Made to Nearest Half Pound)

Class (Pounds)   Class Mark   Frequency   Cumulative Frequency
 90- 99.5          94.75           2               2
100-109.5         104.75          21              23
110-119.5         114.75         104             127
120-129.5         124.75         196             323
130-139.5         134.75         248             571
140-149.5         144.75         197             768
150-159.5         154.75         133             901
160-169.5         164.75          47             948
170-179.5         174.75          25             973
180-189.5         184.75          14             987
190-199.5         194.75           7             994
200-209.5         204.75           4             998
210-219.5         214.75           0             998
220-229.5         224.75           0             998
230-239.5         234.75           1             999
240-249.5         244.75           1            1000

(For future reference: x̄ = 138.65, σ = 18.03, α₃ = .94)


Table 13 — Distribution of Span (Central Values) in Inches Among 2000
Adult Males (Original Measurements to the Nearest Inch)

Span    Frequency        Span    Frequency
58.5         1           71.5       217
59.5         2           72.5       176
60.5         1           73.5       132
61.5         6           74.5        82
62.5         7           75.5        48
63.5        22           76.5        20
64.5        55           77.5        16
65.5       111           78.5        12
66.5       146           79.5         3
67.5       182           80.5         1
68.5       229           81.5         2
69.5       265           82.5         1
70.5       263
                         Total     2000



CHAPTER II 

GRAPHICAL REPRESENTATION 

1. The Function Concept. The primary purpose of a graph is to
show how one quantity varies with another. The two related
quantities may, for example, be population and time, frequency and variate
value, accumulated amount of principal and rate of interest,
insurance premium and age of the insured. One of the most useful
applications of the graph occurs in connection with the representation of
statistical data. Underlying the intelligent use of graphs is the
concept of function, which is a fundamental notion in mathematics and
its applications. The mathematical meaning of function is a
technical one, entirely different from the ordinary meaning. The
student usually meets the word for the first time in algebra, when a
linear or quadratic expression is spoken of as a function of x. An
example is the equation

    $y = P(1 + x)^2$.

The expression on the right is the function of x (P being constant)
and for convenience it is denoted by the single letter y. Here x is an
interest rate and y denotes the amount to which P dollars will
accumulate in two years at the rate x per year. The fact that y is a
function of x is expressed by some such notation as

    $y = f(x)$.

In place of f other letters may be used, the most common being g,
F, and φ.

Examples:

    $f(x) = 5x^2 - 3x + 2$,

    $\phi(x) = K e^{-x^2}$.

Any mathematical expression involving a variable x is a function
of x. However, the word is often used to designate a relation that is
completely divorced from any equation or expression. The central
idea conveyed by this more general meaning is that of a
correspondence between values of y and values of x. The following definition is
the result of a development over a long period and its formulation is
due to Dirichlet, a famous German mathematician (1805-59).

Definition. Let there be a set of values assumed by the
independent variable x. If to each x in the set, there correspond one or
more values of y, then y is said to be a function of x in the set.

It should be observed that this definition¹ is freed from any notion
of the necessity of specifying the mathematical relation between x
and y. We may or may not know the special method by which the
correspondence is set up. A mathematical formula or equation
between x and y may not even exist. A function may thus be
considered as being equivalent to a table in which one may look up
any x of the set of the definition, and find the corresponding y.
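
The idea of a function as a table to be consulted has a direct counterpart in a Python dictionary. The sketch below is only an illustration: the names net_earnings and f are ours, and the figures are those of the small table of yearly net earnings (in millions) given in this section. No formula connects year and earnings; the correspondence itself is the function.

```python
# Year -> net earnings in millions, as tabulated in this section.
net_earnings = {1924: 45.0, 1925: 43.0, 1926: 49.6, 1927: 51.5, 1928: 57.3}


def f(year):
    """Look up y for a given x; f is defined only on the tabulated set."""
    if year not in net_earnings:
        raise ValueError(f"the function is not defined for x = {year}")
    return net_earnings[year]


print(f(1926))               # the y corresponding to x = 1926
print(sorted(net_earnings))  # the set of x values on which f is defined
```

Asking for a year outside the table raises an error, which mirrors the statement that the function exists only for the x values of the set.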

Much of the data in statistics comes under this general definition
of function. Thus, in the following table, net earning is a function
of the year, whether or not there is any equation defining that
functional relationship.

Year    Net Earnings (Millions)
1924         45
1925         43
1926         49.6
1927         51.5
1928         57.3

[Figure: Net Earnings (millions) plotted against Year, 1924 to 1928,
the plotted points joined by straight lines]

Here the function is defined only for the indicated points
which correspond to the values given in the table. The straight
lines are drawn to help the reader visualize the relative
positions of these values and not to represent the function at
intermediate points. They may, however, be thought of as a first
approximation to the unknown function between the given values.
Such a representation of the function could not, of course, be
assumed in the case of discrete variates because then the function
is discontinuous and does not exist except for the given values.

Referring again to the above definition, if there is only one value
of y corresponding to each value of x then y is called a single-valued
function of x; otherwise y is said to be a multiple-valued function of
x. Child weight would be an example of a multiple-valued function
of age, being different for different children. The weight of a
particular child would be a single-valued function of age. For the most
part we shall be concerned with single-valued functions.

¹ A classical example is the function which is defined for the infinite set of
numbers from x = 0 to x = 1 to be unity for all rational numbers and zero for
all irrational numbers.





2. Charts. A detailed study of the technique of representing data
by broken lines, by charts or bar graphs, etc., will not be undertaken
here. It is a rather specialized and non-mathematical subject, and
the student interested in plain-scale cartography can readily find
books on the subject which are very readable.¹ (A discussion of
ratio charts is given in Chapter VII.)

¹ For example, 1. Principles and Methods of Statistics — Chaddock, p. 418.
2. Graphic Charts in Business — Haskell.

3. Frequency Polygon. We present now a discussion of the
graphs that are used in connection with frequency distributions. A
distribution of discrete variates may be represented graphically by
plotting the points $(x_1, f_1), (x_2, f_2), \ldots, (x_k, f_k)$, and
drawing a broken line through them. Such a graph is called a frequency
polygon because it is a polygon formed by connecting the tops of a
series of ordinates whose lengths are proportional to the various
frequencies and whose abscissas correspond to the variate values of the
distribution. Figure 1 will serve as an illustration. For a table of
discrete variates the function exists only for the given values.
Likewise, its graph is discontinuous. The straight lines connecting the
points serve merely to “carry the eye,” thus giving a better idea of
the shape and position of the distribution.

[Fig. 1 — Frequency Polygons for Distributions of Discrete Variates.
I: Frequency Polygon for the Distribution of Table 10.
II: Frequency Polygon for the Distribution of Table 11.]

4. Histogram. If the frequency distribution is one of grouped
variates (discrete or continuous) it is better to use some form of
graphical representation which recognizes the fact that the several
measurements in a table do not lie precisely at the class marks but
are spread out over the intervals of which the class marks are centers.
This may be accomplished through the use of a histogram. A
histogram is a series of rectangles erected at the class boundaries with
altitudes proportional to the respective class frequencies, and
centered on the class marks. Thus the frequencies are represented by
areas. (See Figure 2.) If the bases are all of unit length then the
altitudes are also equal to the frequencies. The histogram is an
important and useful graphical device for representing frequency
distributions.
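
The statement that frequencies are represented by areas can be made concrete with a little arithmetic. In the sketch below, which is an illustration and not a construction prescribed by the text, the hypothetical function histogram_rectangles sets each altitude to f divided by the width of the base, so that base times altitude returns the class frequency. The data are those of Table 6, whose class boundaries run 54.5, 64.5, and so on in steps of 10.

```python
def histogram_rectangles(lower_boundaries, width, frequencies):
    """Return (left, right, altitude) with altitude * width == frequency."""
    return [
        (lo, lo + width, f / width)
        for lo, f in zip(lower_boundaries, frequencies)
    ]


lower = [54.5 + 10 * k for k in range(9)]     # lower class boundaries, Table 6
freq = [3, 21, 78, 182, 305, 209, 81, 21, 5]  # class frequencies, Table 6

rects = histogram_rectangles(lower, 10, freq)
areas = [(right - left) * altitude for left, right, altitude in rects]

print(rects[0])    # first rectangle: base from 54.5 to 64.5
print(sum(areas))  # total area, which equals the total frequency
```

With bases of unit length the division by the width changes nothing, and the altitudes are the frequencies themselves, as noted above.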



5. Frequency Curves. The shape of the distribution may be
emphasized by constructing a continuous frequency curve such that
the areas under the curve between the ordinates at the upper and
lower boundaries of the various rectangles will equal approximately
the areas of the corresponding rectangles. Thus, in Figure 3, the
area of all the rectangles represents the total frequency 1000, and the
area of the three rectangles labeled A, B, C, represents the number
of individuals weighing between 139.75 pounds and 169.75 pounds.
The dotted line represents roughly the frequency curve
corresponding to the histogram.

Representing each class frequency of a distribution of continuous
variates by a rectangle is equivalent to saying that we realize that
the function exists for points other than the class marks, but we do
not know what it is for these points, and so as a first approximation
we assume that the variates are uniformly distributed over each
interval, which is equivalent to regarding them as concentrated at the
class marks. If the class intervals were made smaller and smaller
and at the same time the number of variates were proportionally
increased, the upper bases of the rectangles would approach more
and more the frequency curve which represents the ideal or
theoretical mathematical function relating frequency with variate value
for the given distribution.

A frequency curve is often drawn for convenience in describing
the properties of an observed distribution, although strictly
speaking, the concept of a frequency curve is only applicable to an infinite
“universe” of continuous variates. The data at hand are supposed
to be a “sample” from the universe represented by the frequency
curve.

[Fig. 3 — Histogram and Frequency Curve for a Distribution
of Continuous Variates: Frequency Distribution of the Weights of
1000 Male Students (Table 12)]

The more common types of distributions may be represented by
bell-shaped curves which are either symmetrical or skew. For
elementary purposes it is sufficient to consider frequency distributions
as of these two general types. In passing, we may also mention two
other types which are known as J-shaped and U-shaped. For
examples of these types see Yule and Kendall, An Introduction to the
Theory of Statistics, Ch. VI.





6. Ogives. The graphs of cumulative frequencies are called
ogives. The ogive for Table 7 is shown in Figure 4 and is constructed
by plotting the points (54.5, 0), (64.5, 3), etc., as in algebra, and
joining them with straight lines.

[Fig. 4 — Ogive for Table 7]

The student should observe that while cum f is a function of x it is
defined for the end-x values only. Occasions will perhaps arise when
we desire the x-value corresponding to some intermediate cum f
value, say 453 in Figure 4. Conversely, we might wish to know the
cum f value for some intermediate x-value, say at x = 97. Strictly
speaking, we do not know the answer in either case, inasmuch as we
do not know how the IQ’s are distributed over the interval.
Perhaps all the individual values in the interval 94.5-104.5 (say) are
less than 97; perhaps none are. The fairest assumption we can make
is that they are uniformly distributed throughout the interval. This
means graphically that we represent cum f over each interval by a
straight line, as is done in Figure 4. We may now interpolate under
this line for intermediate values. This is “straight line
interpolation” and is what the student uses when he interpolates in logarithms.
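
Straight-line interpolation on the ogive can be sketched as follows. The function name interpolate is hypothetical; the end-x and cum f columns are those of Table 7, and the two queries are the ones raised in the text (cum f at x = 97, and the x-value at cum f = 453). The sketch assumes, as the text does, that the variates are spread uniformly over each class interval.

```python
end_x = [54.5, 64.5, 74.5, 84.5, 94.5, 104.5, 114.5, 124.5, 134.5, 144.5]
cum_f = [0, 3, 24, 102, 284, 589, 798, 879, 900, 905]


def interpolate(xs, ys, x):
    """Straight-line interpolation of y at x; xs must be strictly increasing."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the tabulated range")


print(interpolate(end_x, cum_f, 97))   # cum f at the intermediate value x = 97
print(interpolate(cum_f, end_x, 453))  # x-value at the intermediate cum f = 453
```

The second call simply exchanges the roles of the two columns, which is permissible here because the cumulative frequencies of Table 7 increase at every step.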

More refined methods exist for interpolating values of a function
between the observed values but their study constitutes a separate
branch of mathematics beyond the scope of this course. It should
be observed that the straight line used here is a first approximation
to the unknown function, and not merely a device to carry the eye
as in the case of a frequency polygon for a discontinuous distribution
of discrete variates.

7. Relation of Cum f to Areas. The sum of the frequencies (cum f)
up to any value of x means, graphically, the sum of the areas of
the rectangles of the histogram up to that value. Thus in Figure 4,
the ordinate erected at x = 84.5 represents the sum of the frequencies
(3 + 21 + 78) = 102 (Figure 2). If a frequency curve represents
the distribution, then cum f, corresponding to any value of x, is the
area under the curve up to that value. Thus, in Figure 3, cum f
corresponding to x = 139.75 is approximately the area under the
smooth curve up to x = 139.75, and the total area under the curve
is cum f = N.
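
The relation between cum f and histogram area can be verified numerically for the IQ data. The sketch below is an illustration under our own assumptions: the function name area_up_to is hypothetical, each rectangle is taken to have an area equal to its class frequency, and a rectangle cut by the ordinate at x contributes in proportion to the part of its base lying to the left of x.

```python
def area_up_to(x, lower_boundaries, width, frequencies):
    """Area of the histogram lying to the left of the ordinate at x."""
    total = 0.0
    for lo, f in zip(lower_boundaries, frequencies):
        covered = min(max(x - lo, 0.0), width)  # part of the base left of x
        total += f * covered / width
    return total


lower = [54.5 + 10 * k for k in range(9)]     # lower class boundaries, Table 6
freq = [3, 21, 78, 182, 305, 209, 81, 21, 5]  # class frequencies, Table 6

print(area_up_to(84.5, lower, 10, freq))   # the rectangles for 3 + 21 + 78
print(area_up_to(144.5, lower, 10, freq))  # the whole area, cum f = N
```

At a class boundary the area is exactly the tabulated cum f; at an intermediate x it agrees with the value read from the straight-line ogive.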


Exercises 

1. (a) If $f(x) = 2x^2$, what is $f(-x)$; $f(3)$; $f(-2)$?

   (b) Given a function f(x), what is the graphical meaning of f(c) where c is
       any real number?

2. Make a histogram for the data of Table 4.

3. Do likewise for Table 8 or 9.

4. Construct an ogive for the cumulative frequencies given in Table 12.

5. Find the cumulative frequencies and construct the ogive for Table 9.

6. For further discussion of ogive curves and their uses, read the following
   references:

   (a) Elements of Statistics — Davis and Nelson, pp. 23-28.

   (b) The Mathematics of Statistics — Burgess, pp. 61-72.



CHAPTER III 
AVERAGES 

1. It was pointed out in Chapter I that classification of the
variates of any long series is the first step necessary to overcome the
confusion of detail in the original observations, and to make
comparisons with other distributions possible. In Chapter II graphical
methods were studied which describe, to some extent, the shape and
position of the distribution. Although these methods are helpful,
their contribution is largely qualitative.

It is desirable to formulate quantitative descriptions for
characterizing a distribution, and as an aid in this direction averages are very
useful. An average is a single value measuring the central tendency
of the distribution. In a sense, it is a typical value of the whole set
of variates, although it is not necessary that it actually have the value
of one of the items of the set it represents. There are five averages
in common use. These are: arithmetic mean, mode, median, geometric
mean, and harmonic mean. The mean and median are most
frequently used although the arithmetic mean is by far the most
important in general statistical work, and the others are of service in special
cases. We will consider them in the order named. First, however,
it will be desirable to discuss certain symbols and notation which will
facilitate the development of formulas.

2. Notation. If $x$ denotes a variable, then $x_1, x_2, \ldots, x_N$ are
general symbols for the values which $x$ may take. When we are concerned
with a sum like the following,

\[ x_1 + x_2 + x_3 + x_4 + \cdots + x_i + \cdots + x_N, \]

it is customary to designate it by placing the Greek capital letter $\Sigma$
(sigma) before the general term, thus

\[ \sum_{i=1}^{N} x_i = x_1 + x_2 + \cdots + x_i + \cdots + x_N. \]

The symbol $\Sigma$ is a sort of mathematical verb and the notation
written above and below it may be called adverbs. Mathematicians
call $\Sigma$ an operator and speak of the "adverbs" as limits. When
$\sum_{i=1}^{N}$ is placed before any quantity, it means, "add up all quantities
like $\cdots$ which are formed by giving $i$ the values of every positive
integer from $i = 1$ to $i = N$, inclusive." Thus if $x_i$ stands for "variates"
in Table 1, $x_1$ refers to the first value 75, $x_2$ refers to the second value
80, etc., and $x_N$ refers to the last value 56. Here $N = 100$. Hence
the compact notation $\sum_{i=1}^{100} x_i$ denotes the sum of all the variates in
Table 1. The symbol $\sum_{i=1}^{N} x_i$ is read, "the summation of x-sub-i, i
varying (or running) from one to N." The subscript $i$ is called the
index of summation. Any letter may be used as an index but it is
conventional to use $i$ or $j$. Also the upper limit may be denoted by
any letter but we shall use $N$ to denote the total number of variates
(some of which may be alike) in a set.

If a variable $x$ is to take on the particular values 1, 2, 3, etc.,
instead of the general values $x_1, x_2, x_3$, etc., then $x$ itself becomes the
index of summation and we write $x = 1$ underneath $\Sigma$. Thus

\[ \sum_{x=1}^{N} x = 1 + 2 + 3 + \cdots + N, \]

\[ \sum_{x=1}^{N} x^2 = 1 + 2^2 + 3^2 + \cdots + N^2. \]

Frequently the index of summation is understood from the context
and the notation at the top and bottom of $\Sigma$ may be omitted if no
ambiguity results.

It is imperative that the student master, as soon as possible, the
significance and utility of the $\Sigma$ notation.

Illustrations:

1. $\sum_{i=1}^{N} 3x_i = 3x_1 + 3x_2 + \cdots + 3x_N = 3(x_1 + x_2 + \cdots + x_N)$.

2. $\sum_{i=1}^{5} (x_i + c) = (x_1 + c) + (x_2 + c) + (x_3 + c) + (x_4 + c) + (x_5 + c)
   = (x_1 + x_2 + x_3 + x_4 + x_5) + 5c$.

3. $\sum_{i=1}^{4} x_i f_i = x_1 f_1 + x_2 f_2 + x_3 f_3 + x_4 f_4$.

4. $\sum_{i=1}^{4} x_i y_i = x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4$.

5. $\sum_{x=1}^{N} x^2 = 1^2 + 2^2 + 3^2 + \cdots + N^2$.

The following simple theorems will be useful in our work. 

Theorem I. The summation $\Sigma$ of an algebraic sum of two or more
terms is the same algebraic sum of the $\Sigma$'s of these terms taken separately.
In symbols:

\[ \sum_{i=1}^{N} (x_i + y_i - z_i) = \sum_{i=1}^{N} x_i + \sum_{i=1}^{N} y_i - \sum_{i=1}^{N} z_i. \]

Theorem II. A constant factor may be removed from under the
summation sign and written outside as a factor. Thus,

\[ \sum_{i=1}^{N} c x_i = c \sum_{i=1}^{N} x_i. \]

Proofs: It is left as an exercise for the student to prove these two
theorems by expanding the expressions.

Theorem III. If the expression under $\sum_{i=1}^{N}$ is a constant $c$, the expanded
result is $Nc$.

Examples:

1. $\sum_{i=1}^{N} c = c + c + \cdots + c = Nc$.

2. $\sum_{i=1}^{N} (x_i - c) = \sum_{i=1}^{N} x_i - \sum_{i=1}^{N} c$, by Theorem I,
   $= \sum_{i=1}^{N} x_i - Nc$, by Theorem III.

The above theorems hold also if we replace the notation
$\sum_{i=1}^{N} x_i$ by $\sum_{x=1}^{N} x$, etc.
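Theorems I-III may also be checked numerically. The following Python sketch does so on three short lists; the lists and the constant c are made-up illustrative values, not data from the text.

```python
# Numerical check of Theorems I-III on small, made-up lists (illustrative data).
x = [2, 5, 7, 1]
y = [4, 0, 3, 6]
z = [1, 1, 2, 8]
c = 3
N = len(x)

# Theorem I: the sum of (x_i + y_i - z_i) equals sum(x) + sum(y) - sum(z).
lhs_1 = sum(xi + yi - zi for xi, yi, zi in zip(x, y, z))
rhs_1 = sum(x) + sum(y) - sum(z)

# Theorem II: a constant factor may be taken outside the summation.
lhs_2 = sum(c * xi for xi in x)
rhs_2 = c * sum(x)

# Theorem III: summing the constant c over N terms gives N * c.
lhs_3 = sum(c for _ in range(N))
rhs_3 = N * c

print(lhs_1 == rhs_1, lhs_2 == rhs_2, lhs_3 == rhs_3)  # True True True
```

A check of this kind is no substitute for the proof by expansion, but it helps to fix the meaning of the notation.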

The next two theorems have to do with summing integers. The 
numbers used in counting, 

1, 2, 3, 4, 5, • . • 

are called integers or natural numbers. 



Theorem IV. The sum of the first N integers is $N(N + 1)/2$.
In symbols:

\[ \sum_{x=1}^{N} x = \frac{N(N + 1)}{2}. \]

This result follows from the fact that the integers form an arithmetic
progression.

Theorem V. The sum of the squares of the first N integers is
$N(N + 1)(2N + 1)/6$. In symbols:

\[ \sum_{x=1}^{N} x^2 = \frac{N(N + 1)(2N + 1)}{6}. \]

Proof: Let us take the identity $x^3 - (x - 1)^3 = 3x^2 - 3x + 1$,
and sum each side for $x = 1$ to $N$. Thus,

\[ \sum_{x=1}^{N} [x^3 - (x - 1)^3] = \sum_{x=1}^{N} [3x^2 - 3x + 1]. \]

Applying Theorems I-III to the right side we have

\[ \sum_{x=1}^{N} [x^3 - (x - 1)^3] = 3\sum_{x=1}^{N} x^2 - 3\sum_{x=1}^{N} x + N. \]

Performing the indicated sum in the left member, we have

\[ (1^3 - 0^3) + (2^3 - 1^3) + (3^3 - 2^3) + \cdots + [N^3 - (N - 1)^3], \]

whose sum is $N^3$, since every other term cancels. Therefore

\[ N^3 = 3\sum_{x=1}^{N} x^2 - 3\sum_{x=1}^{N} x + N. \]

Hence, using Theorem IV and simplifying,

\[ \sum_{x=1}^{N} x^2 = \frac{2N^3 + 3N(N + 1) - 2N}{6}, \]

whence

\[ \sum_{x=1}^{N} x^2 = \frac{N(N + 1)(2N + 1)}{6}. \]
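The closed forms of Theorems IV and V, and the telescoping step used in the proof of Theorem V, can be compared with direct summation. The sketch below is in Python; the function names are illustrative choices and do not come from the text.

```python
def sum_of_integers(n):
    """Closed form of Theorem IV: 1 + 2 + ... + n."""
    return n * (n + 1) // 2


def sum_of_squares(n):
    """Closed form of Theorem V: 1^2 + 2^2 + ... + n^2."""
    return n * (n + 1) * (2 * n + 1) // 6


for n in (1, 5, 10, 100):
    # Direct term-by-term sums, to compare with the closed forms.
    assert sum(range(1, n + 1)) == sum_of_integers(n)
    assert sum(k * k for k in range(1, n + 1)) == sum_of_squares(n)
    # Telescoping step of the proof: the sum of x^3 - (x-1)^3 collapses to n^3.
    assert sum(k**3 - (k - 1) ** 3 for k in range(1, n + 1)) == n**3

print(sum_of_integers(100), sum_of_squares(100))  # 5050 338350
```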


3. Arithmetic Mean. The arithmetic mean of a set of variates is
equal to the sum of the variates divided by their number. We are
thinking now of a set of ungrouped variates, like that of Table 1. If
we use the symbol $\bar{x}$ to represent the arithmetic mean of the N
variates $x_1, x_2, x_3, \ldots, x_N$, then

\[ \bar{x} = \frac{1}{N}(x_1 + x_2 + x_3 + \cdots + x_N), \]

or using the more compact notation of the preceding section, we have

(1)    \[ \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i. \]

As an illustration, it is easily verified that for the set of grades given
in Table 1,

\[ \bar{x} = \frac{7267}{100} = 72.67. \]

Computing the mean¹ strictly according to equation (1) may be called
the serial method to distinguish it from other methods which will be
presented. This definition is applicable when N is so small that a
grouping of the variates into a frequency distribution is not feasible.

If x refers to the integers from 1 to N, their mean is

(1a)    \[ \bar{x} = \frac{1}{N}\sum_{x=1}^{N} x. \]

4. Weighted Arithmetic Mean. It will be noticed that several of
the grades given in Table 1 are alike. For example, 80 occurs seven
times. It should be evident that the same result would be found for
the mean if, instead of summing the individual values, each value was
first multiplied by the frequency with which it occurs and all such
products were then added. In general, if the values $x_1, x_2, \ldots, x_k$
occur with corresponding frequencies $f_1, f_2, \ldots, f_k$, respectively,
where $f_1 + f_2 + \cdots + f_k = N$, it follows that

\[ \bar{x} = \frac{x_1 f_1 + x_2 f_2 + \cdots + x_k f_k}{f_1 + f_2 + \cdots + f_k}, \]

or, in shorter notation

(2)    \[ \bar{x} = \frac{1}{N}\sum_{i=1}^{k} x_i f_i, \quad \text{where } N = \sum_{i=1}^{k} f_i. \]

¹ When there is no ambiguity, the arithmetic mean is often referred to as the
mean.

When obtained in this way, $\bar{x}$ is generally called a weighted arithmetic
mean. The term originated in experimental science where
some readings which have been made under more favorable conditions
are "weighted" according to their reliability or importance. When
the weights have been chosen, they become, essentially, frequencies.

If the x's are added individually, the f's become unity, and equation
(2) reduces to (1). The student should notice that, for the
same data, $\sum_{i=1}^{k} f_i x_i$ is numerically equal to $\sum_{i=1}^{N} x_i$. He should also
observe that N refers to the number of variates in the set (some of
which may be alike), whereas k refers to the number of different values
of x in the set and hence to the number of products of the form $x_i f_i$
where $f_i$ is the number of times $x_i$ occurs. In the following example,
N = 8 and k = 4.


Example. For the values 6, 8, 7, 6, 5, 7, 6, 5,

\[ \sum_{i=1}^{8} x_i = x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 = 6 + 8 + 7 + 6 + 5 + 7 + 6 + 5 = 50, \]

\[ \sum_{i=1}^{4} f_i x_i = f_1 x_1 + f_2 x_2 + f_3 x_3 + f_4 x_4 = 2 \cdot 5 + 3 \cdot 6 + 2 \cdot 7 + 1 \cdot 8 = 50. \]

By either method, $\bar{x} = 50/8 = 6.25$.
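The agreement of the serial method (1) and the weighted method (2) in this example can be reproduced in a few lines of Python. This is a minimal sketch; the variable names are illustrative, and `collections.Counter` is used only as a convenient way to tally the frequencies.

```python
from collections import Counter

values = [6, 8, 7, 6, 5, 7, 6, 5]  # the eight variates of the example

# Serial method, formula (1): add every variate and divide by N.
N = len(values)
serial_mean = sum(values) / N

# Weighted method, formula (2): multiply each distinct value by its frequency.
freq = Counter(values)  # distinct value -> number of occurrences
k = len(freq)
weighted_mean = sum(x * f for x, f in freq.items()) / sum(freq.values())

print(N, k, serial_mean, weighted_mean)  # 8 4 6.25 6.25
```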


1. Write in expanded form: 

(a) * 

4=1 


Exercises 


») 

4=1 


2. Write in expanded form: 


(<*) TJH 

1 


m+«2 

(6) E /.•; 

t*=m+l 


(c) 

»=1 


m m +na 

(c) E*</i + E 

4 = 1 4 =m +1 


3. Express 2(c) as a single summation, if $n_1 + n_2 = k$.

4. Write in the abbreviated form, using $\Sigma$:

(a) $x_1 f_1 + x_2 f_2 + \cdots + x_k f_k$.

(b) $(x_1 - \bar{x})f_1 + (x_2 - \bar{x})f_2 + \cdots + (x_k - \bar{x})f_k$.

(c) $\frac{1}{N}[(x_1 - \bar{x})^2 f_1 + (x_2 - \bar{x})^2 f_2 + \cdots + (x_k - \bar{x})^2 f_k]$.

5. Prove:

(a) $\sum_{1}^{k} (x_i + 1)^2 f_i = \sum_{1}^{k} x_i^2 f_i + 2\sum_{1}^{k} x_i f_i + N$.

(b) $\sum_{x=0} x(x - 1)p = \sum_{x=2} x(x - 1)p$.

6. Compute the value of 1(c) for the example in §4, using the following form:

    x_i     f_i     (x_i - x̄)     (x_i - x̄) f_i
     5       2        -1.25          -2.50
     6       3
     7       2          ?              ?
     8       1

                        Σ (x_i - x̄) f_i = ?

7. Distinguish between $\sum_{i=1}^{N} x_i y_i$ and $\left(\sum_{i=1}^{N} x_i\right)\left(\sum_{i=1}^{N} y_i\right)$. Write in expanded form.

8. (a) Express in $\Sigma$ notation: Each different variate is multiplied by its own
f and the sum of the results is divided by N.

(b) Give word statements of the expressions in Exercise 4.

9. Using the identity

\[ x^2 - (x - 1)^2 = 2x - 1 \]

derive the result

\[ \sum_{x=1}^{N} x = \frac{N(N + 1)}{2} \]

by a method analogous to the proof of Theorem V.

10. (a) Express in abbreviated notation: The sum of the squares of the x’s 
divided by the square of their sum. 

(b) If x refers to the integers from 1 to N, evaluate your answer to (a) in 
terms of N. 

(c) Show that the mean of the first N integers is ( N + 1)/2. 

5. Arithmetic Mean from Frequency Table. The variates in each 
class interval of a frequency distribution are assumed to have the 
value of the class mark for that interval. Therefore, we may use 
formula (2) to find the mean of a frequency distribution. In this 
case, $x_i$ represents the mid-value of the ith class interval, $f_i$ the
corresponding frequency, and k the number of intervals, with i running
from 1 to k. The method of applying (2) is illustrated in Table 14 from the
data of Table 2.







In this connection it is interesting to note that our result here 
differs very little from the true value 72.67 and therefore our assump¬ 
tion that all values in a given class may be taken as the class mark 
seems to cause little error in the result obtained for the mean. This 
can be proved mathematically (under certain assumptions) and will 
be referred to later. 


Table 14

    Class mark     Frequency     Product
        x              f            fx
       34.5            2           69.0
       44.5            3          133.5
       54.5           11          599.5
       64.5           20         1290.0
       74.5           32         2384.0
       84.5           25         2112.5
       94.5            7          661.5

                    N = 100     Σfx = 7250.0

\[ \bar{x} = \frac{7250}{100} = 72.50. \]

If we denote the class interval by c then it is obvious that c = 10 in Table 14.
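The arithmetic of Table 14 can be repeated mechanically. The following Python sketch applies formula (2) to the class marks (34.5, 44.5, ..., 94.5, as listed in Table 15) and the frequencies of the table; the variable names are illustrative only.

```python
class_marks = [34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
frequencies = [2, 3, 11, 20, 32, 25, 7]

# Each product f*x is one entry of the "Product" column of Table 14.
products = [f * x for x, f in zip(class_marks, frequencies)]

N = sum(frequencies)          # total frequency
total = sum(products)         # the sum of the fx column
grouped_mean = total / N      # formula (2)

print(products)
print(N, total, grouped_mean)  # 100 7250.0 72.5
```

The result, 72.5, is the grouped approximation; the serial method applied to the original 100 grades gave 72.67.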


6. Translation of Axes; Deviations. It is frequently useful to 
employ the methods and results of geometry in connection with the 
problems of statistics. Foremost among these methods is the repre¬ 
sentation of numbers by points on a line; an origin and a unit of 
measure having been chosen, a coordinate is assigned to each point on 
the line. When a frequency distribution is represented by a graph,
we have seen in Chapter II that the variate values are used as abscissas
or measurements along the x-axis. The mean is therefore the
point on the x-axis whose coordinates are $(\bar{x}, 0)$. Its position may
be emphasized by drawing a vertical line through this point, but it is
the horizontal distance of the point from the origin and not the
vertical line which represents graphically the mean.

In discussing the variates we may often work with smaller numbers
by changing the origin of reference. If new axes, $x'y'$, are taken
parallel to the old axes, $xy$, with positive directions preserved, the
axes are said to be translated from one position to the other. A translation
of axes corresponds to a transformation of coordinates. Thus if
we let

\[ x' = x - x_0, \qquad y' = y - y_0 \]

the origin is translated to the point $(x_0, y_0)$. Since the variates are
denoted by x we are concerned here only with the transformation
$x' = x - x_0$ which translates the origin to the point $(x_0, 0)$. The
variates referred to a new origin are often called deviations. In
particular if we translate the origin to the mean by letting

\[ x' = x - \bar{x}, \]

then for a frequency distribution the deviations are the values
obtained by subtracting $\bar{x}$ from each of the class marks. Thus,

\[ x_1' = x_1 - \bar{x} \]
\[ x_2' = x_2 - \bar{x} \]
\[ \cdots \]
\[ x_k' = x_k - \bar{x}. \]

The units of measurement remain unchanged. Figure 5 shows the
two systems when the axes are translated to $(\bar{x}, 0)$.

Obviously, any variates that are larger than $\bar{x}$ will be positive in terms
of $x'$ and any variates smaller than $\bar{x}$ will be negative in terms of $x'$.

7. Properties of $\bar{x}$. There are two important properties of $\bar{x}$
which may be stated in the following theorems:

Theorem VI. The algebraic¹ sum of the deviations of all the variates
from their arithmetic mean is zero.

¹ That is, taking account of signs. Some of the deviations will be positive and
some negative.

Proof: Let $x'$ represent a deviation from the mean. Multiplying
each different deviation by the number of times it occurs and adding
these products we have,

\[ \sum_{1}^{k} f_i x_i' = \sum_{1}^{k} f_i (x_i - \bar{x}) \]
\[ = \sum_{1}^{k} f_i x_i - \sum_{1}^{k} f_i \bar{x}, \text{ by Theorem I} \]
\[ = \sum_{1}^{k} f_i x_i - \bar{x}\sum_{1}^{k} f_i, \text{ by Theorem II.} \]

Recalling from (2) that $\sum_{1}^{k} f_i x_i = N\bar{x}$, and that $\sum_{1}^{k} f_i = N$, we have

(3)    \[ \sum_{1}^{k} f_i (x_i - \bar{x}) = N\bar{x} - \bar{x}N = 0. \]
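Theorem VI can be illustrated on the grouped grades of Table 14, whose mean was found to be 72.5. The Python sketch below forms each deviation from the mean, weights it by its frequency, and adds; the variable names are illustrative only.

```python
class_marks = [34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
frequencies = [2, 3, 11, 20, 32, 25, 7]

N = sum(frequencies)
mean = sum(f * x for x, f in zip(class_marks, frequencies)) / N

# Deviation of each class mark from the mean: x' = x - x_bar.
deviations = [x - mean for x in class_marks]

# Weighted (algebraic) sum of the deviations, the left member of (3).
weighted_sum = sum(f * d for f, d in zip(frequencies, deviations))

print(mean, deviations, weighted_sum)
```

The negative products total -518 and the positive products +518, so the algebraic sum vanishes as the theorem requires.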

Theorem VII. If the variates are referred to a new origin $x_0$ and
expressed in units of c by means of the transformation

(4)    \[ u = \frac{x - x_0}{c}, \quad (c \neq 0), \]

then the old mean, $\bar{x}$, is related to the new mean, $\bar{u}$, by the following
formula:

(5)    \[ \bar{x} = c\bar{u} + x_0. \]

Proof: From (4),

(4a)    \[ x = cu + x_0 \]

and substitution of this value for x in the definition (2) gives

\[ \bar{x} = \frac{1}{N}\sum_{i=1}^{k} f_i (c u_i + x_0). \]

By Theorems I and II this equals

\[ c \cdot \frac{1}{N}\sum_{i=1}^{k} f_i u_i + x_0 \cdot \frac{1}{N}\sum_{i=1}^{k} f_i. \]

But the first of these expressions is, by definition, c times the mean
value of u, and the second is, from (2), simply $x_0$. Therefore

\[ \bar{x} = c\bar{u} + x_0. \]

This is an important relation and its derivation should be mastered.



Corollary. If the mean of the deviations of the variates from any
arbitrary number, $x_0$, is found and added algebraically to $x_0$, the result is
the mean $\bar{x}$. In symbols,

(6)    \[ \bar{x} = \frac{1}{N}\sum_{1}^{k} f_i (x_i - x_0) + x_0. \]

The proof follows from (4) and (5).

In (5) and (6), $x_0$ may be regarded as a provisional mean, and the
first term in the right members may be thought of as the correction
to be added algebraically to the provisional mean in order to get the
true mean.

8. Short Methods of Computing $\bar{x}$. In practice it is seldom desirable
to use formula (2) as applied in Table 14 in computing the mean
of a frequency distribution. Theorem VII provides a much easier
method.

Case I (class intervals equal). If the class marks are equispaced,
let c equal the class interval and choose $x_0$ as one of the class marks,
usually the one opposite the largest frequency. From (4), $x_0$ becomes
the origin of u, because when $x = x_0$, $u = 0$.

The method of using (5) is illustrated in Table 15. Here c = 10
and we choose $x_0 = 74.5$, so (4) becomes

\[ u = \frac{x - 74.5}{10}. \]

Substituting the given values of x in this relation we get the values in
the u column. So in running the fu column, small values of u are
multipliers of the larger values of f.


Table 15. Mean of 100 Grades Using Class Interval as Unit

      x        u        f       fu
     34.5     -4        2       -8
     44.5     -3        3       -9
     54.5     -2       11      -22
     64.5     -1       20      -20
     74.5      0       32        0
     84.5      1       25       25
     94.5      2        7       14

    Totals            100      -20





Then

\[ \bar{u} = \frac{-20}{100} = -0.2, \]

so from (5),

\[ \bar{x} = 10(-0.2) + 74.5 = 72.5\%. \]

It should be evident that the final value obtained for the mean is
independent of the choice of the arbitrary value $x_0$. This choice is
only a rough guess and it is really immaterial which of the given
values is selected as $x_0$, except that the nearer it is to the mean the
lighter will be the calculations to follow. A check on the arithmetic
may, therefore, be effected by selecting a different provisional mean.

This indirect method is sometimes called coding because the variates
are coded to another scale in which it is easier to compute the
mean. Formula (5) is the relation, then, for transforming the mean
from one scale to another.

When $N = \sum_{i=1}^{k} f_i$ is large and the x's are equispaced the indirect
method of computing the mean should always be used.
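The coding method of Case I may be sketched in Python as follows. The function name `coded_mean` and its parameter names are illustrative assumptions; the function simply applies transformation (4) and formula (5). Calling it with two different provisional means carries out the check on the arithmetic described above.

```python
def coded_mean(class_marks, frequencies, x0, c=1.0):
    """Mean by formula (5): x_bar = c * u_bar + x0, with u = (x - x0) / c."""
    if c == 0:
        raise ValueError("c must be nonzero, as required in (4)")
    n = sum(frequencies)
    # Coded values u_i from transformation (4).
    u_values = [(x - x0) / c for x in class_marks]
    # Mean of the coded values, weighted by the frequencies.
    u_bar = sum(f * u for f, u in zip(frequencies, u_values)) / n
    # Decode back to the original scale and origin.
    return c * u_bar + x0


marks = [34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
freqs = [2, 3, 11, 20, 32, 25, 7]

first = coded_mean(marks, freqs, x0=74.5, c=10)   # the choice made in Table 15
second = coded_mean(marks, freqs, x0=64.5, c=10)  # a different provisional mean
print(first, second)
```

Both calls should agree, up to floating-point rounding, with the value 72.5 found in the text. With the default c = 1 the same function covers distributions whose class marks are not equispaced.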

Case II (class intervals unequal). Occasionally a frequency distribution
is encountered in which the variates are not equispaced;
it is then usually best to take c = 1 (unless the x's have a common
factor c) and be content with whatever simplification results from a
suitable choice of $x_0$. This is equivalent to using the above corollary.

In Table 16, we choose $x_0 = 200$ and are thus able to simplify the
work a little.

Table 16

       x          f          u            uf
     106.12       7       -93.88       -657.16
     191.83      14        -8.17       -114.38
     246.48      32        46.48       1487.36
     283.63      49        83.63       4097.87
     257.65      55        57.65       3170.75
     294.51      54        94.51       5103.54
     222.53      35        22.53        788.55
      71.43      14      -128.57      -1799.98

    Totals      260                   12076.55

\[ \bar{u} = \frac{1}{N}\sum f_i u_i = \frac{1}{N}\sum f_i (x_i - 200) = \frac{12076.55}{260} = 46.448, \]

\[ \bar{x} = \bar{u} + x_0 = 246.45. \]

9. Geometric Explanation. Let us consider further the relation
between the variables x and u, defined by the expression

\[ u = \frac{x - x_0}{c}. \]

A geometric explanation will be helpful.

Graphically, the x values are distances along the x-axis measured
from zero as origin. Likewise $x_0$ is some point on the x-axis at a
distance of $x_0$ units from zero. If now the points representing the x
values are measured from $x_0$ as origin they are denoted by $x - x_0$.
(See Figure 6.) Thus if $x_0 = 24$, a value which is 36 with reference
to the origin of x will be 12 with reference to $x_0$; likewise a value
x = 18 becomes $x - x_0 = -6$ when referred to $x_0$ as origin. It
should be noted that $x - x_0$ is in the same units as x. Thus if x is in
inches, $x - x_0$ will also be in inches. But $(x - x_0)/12$ would then
be in feet. Instead of dividing by 12 suppose we divide by c.
Then $(x - x_0)/c$ will be in units of c whatever c may be. It is
convenient to denote the resulting values by a different letter, say u.
Therefore the numerator of (4) changes the origin of reference but does
not affect the scale of measurement. The denominator changes the
scale, there being c of the x units in one of the u units. Relation (4)
has this generalized meaning apart from statistics. Mathematical
notation is applicable to many different fields of knowledge.

Fig. 6

When (4) is applied to a frequency distribution it is convenient to
select $x_0$ as one of the mid-x values and to take c as the width of the
class intervals. Under Case I, the mean is found with reference to
$x_0$ and in units of c. This is the mean, $\bar{u}$, of the numbers representing
the various class intervals weighted with the corresponding frequencies.
After this mean is computed it may be converted back into
units of x by multiplying by c, and then referred to the origin of x by
adding $x_0$. (See Figure 7.) Hence we have $\bar{x} = c\bar{u} + x_0$. Thus we
arrive at the same result as that obtained algebraically.

Fig. 7. If $x_0 < \bar{x}$, $c\bar{u}$ is positive; if $x_0 > \bar{x}$, $c\bar{u}$ is negative.

If we had denoted the variates by y we could have used the relation

\[ v = \frac{y - y_0}{c} \]

corresponding to (4). Geometrically, this would mean a change of
units and a translation of origin in the y-direction. The relation
corresponding to (5) would then be

\[ \bar{y} = c\bar{v} + y_0, \quad \text{where } \bar{v} = \frac{1}{N}\sum f_i v_i. \]

As the short-cut method is an important one, another illustration
is given in Table 17 (based on Table 4). Here we take
$u = (x - 2.745)/0.5 = 2(x - 2.745)$.

Table 17. Computation of Mean Monthly Rainfall at Iowa City, 1890-1925

        x              f        u        fu
      0.245           23       -5      -115
      0.745           42       -4      -168
      1.245           58       -3      -174
      1.745           62       -2      -124
      2.245           49       -1       -49
      2.745 (= x_0)   47        0         0
      3.245           32        1        32
      3.745           27        2        54
      4.245           18        3        54
      4.745           15        4        60
      5.245           14        5        70
      5.745            7        6        42
      6.245           10        7        70
      6.745            5        8        40
      7.245            6        9        54
      7.745            5       10        50
      8.245            3       11        33
      8.745            2       12        24
      9.245            5       13        65
      9.745            0       14         0
     10.245            1       15        15
     10.745            1       16        16

     Totals          432                 49

\[ \bar{x} = 2.745 + \frac{(0.5)(49)}{432} = 2.802 \text{ inches.} \]

10. Mean of Means. So far we have used subscripts to distinguish
between the variates within a set: $x_1, x_2, \ldots, x_N$. By this time
the student should be thinking easily in this notation so we may now
state an additional use of subscripts. Instead of using x and y to
distinguish between two sets of variates we may use $x_1$ and $x_2$. Then
to distinguish the variates within a set we would add a second subscript,
so for the $x_1$ set the variates are

\[ x_{11}, x_{12}, x_{13}, \ldots, x_{1n_1} \]

and for the $x_2$ set the variates are

\[ x_{21}, x_{22}, x_{23}, \ldots, x_{2n_2}. \]

These are read "x two one," etc., not "x twenty-one," etc. In the
notation dealing with one set, x was a variable but $x_1, x_2$, etc., were
constants. Now $x_1$ and $x_2$ are variables and $x_{11}, x_{12}, \ldots, x_{21}, x_{22}, \ldots$,
etc., are constants. Thus $x_1$ and $x_2$ may denote the grades of two
sections of mathematics in which there are $n_1$ and $n_2$ students respectively.
Then the mean of the first set is

(a)    \[ \bar{x}_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} x_{1i} \]

and the mean of the second set is

(b)    \[ \bar{x}_2 = \frac{1}{n_2}\sum_{i=1}^{n_2} x_{2i}. \]

We will now state a useful theorem.




Theorem VIII. If the mean of a set of $n_1$ variates is $\bar{x}_1$ and the mean
of another set of $n_2$ variates is $\bar{x}_2$, the mean $\bar{x}$ of the combined sets is

(7)    \[ \bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{N} \]

where $N = n_1 + n_2$.

Proof: It is obvious from equations (a) and (b) that

(c)    \[ n_1\bar{x}_1 + n_2\bar{x}_2 = \sum_{i=1}^{n_1} x_{1i} + \sum_{i=1}^{n_2} x_{2i}. \]

If x is allowed to stand for $x_1$ and $x_2$ in succession as shown in the
preceding table then the right member of (c) may be written $\sum_{i=1}^{n_1+n_2} x_i$,
which denotes the sum of all the variates when they are combined
into one set. If this latter sum is divided by the total number of variates
N the result is, by definition, their mean. Hence

\[ \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} = \frac{\sum_{i=1}^{n_1} x_{1i} + \sum_{i=1}^{n_2} x_{2i}}{n_1 + n_2} = \frac{1}{N}\sum_{i=1}^{n_1+n_2} x_i = \bar{x}. \]

We may express (7) in more compact notation as follows:

\[ \bar{x} = \frac{1}{N}\sum_{i=1}^{2} n_i\bar{x}_i, \qquad N = \sum_{i=1}^{2} n_i. \]



This form lends itself to a generalization for k sets so we have the
following theorem.

Theorem IX. The mean of a set of N variates which is composed of k
subsets is

(8)    \[ \bar{x} = \frac{1}{N}\sum_{i=1}^{k} n_i\bar{x}_i \]

where $\bar{x}_i$ is the mean and $n_i$ is the frequency in the ith subset and
$N = \sum_{i=1}^{k} n_i$.

Corollary. If $n_i = n$ is the same for all the sets, then $N = kn$ and
(8) reduces to

(9)    \[ \bar{x} = \frac{1}{k}\sum_{i=1}^{k} \bar{x}_i. \]
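Formulas (8) and (9) may be illustrated with a short Python sketch. The function name `combined_mean` and the two small sections of grades are hypothetical, chosen only so that the pooled mean can be compared with the mean obtained from the subset means.

```python
def combined_mean(sizes, means):
    """Formula (8): mean of the pooled set from subset sizes n_i and subset means."""
    if len(sizes) != len(means):
        raise ValueError("sizes and means must have the same length")
    total = sum(sizes)
    return sum(n * m for n, m in zip(sizes, means)) / total


# Two hypothetical sections of grades (made-up values for illustration).
section_1 = [70, 80, 90]
section_2 = [60, 65, 75, 85, 95]

mean_1 = sum(section_1) / len(section_1)
mean_2 = sum(section_2) / len(section_2)

# Mean of all the variates pooled into one set, by the serial method.
pooled = sum(section_1 + section_2) / (len(section_1) + len(section_2))

# The same mean recovered from the subset sizes and means alone.
from_means = combined_mean([len(section_1), len(section_2)], [mean_1, mean_2])

print(mean_1, mean_2, pooled, from_means)  # 80.0 76.0 77.5 77.5
```

Note that the simple average of the two means, 78.0, differs from 77.5; the two agree only when the subsets are of equal size, as the corollary states.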

Exercises 

1. (a) Use (1), §3, to find the mean of the following numbers: 18, 42, 23, 16,
103, 61, 49, 95, 113, 10.

(b) For the numbers in (a) verify that the sum of their deviations from their
mean is zero. What theorem does this exercise illustrate?

2. Find the deviations of the numbers in Ex. 1 from 50 and verify that the mean
of these deviations added algebraically to 50 gives the mean of the numbers
themselves.

3. Prove: The sum of the deviations of the variates from their mean is zero.

4. Derive the relation $\bar{x} = c\bar{u} + x_0$.

5. Find the arithmetic mean of the weights of 1000 students given in Table 12.
Use (5). Ans. 138.65 lbs.

6. Find the mean monthly rainfall at Des Moines from 1890 to 1925, using the
frequency distribution which you previously made. Ans. 2.55 inches.

7. Find the mean of the distribution of discrete variates given in Table 11.

8. Prove the following theorem: The mean of a set of variates is unchanged if
each variate is replaced by the mean of all the variates.

9. (a) Prove expressions (8) and (9).

(b) The mean grade of one class of 20 students is 76% and of another class of
15 students is 80%. Find the mean of the two classes.

10. The record of Freshmen scholastic averages for a semester at a certain
university were given as follows:

               n_i       x̄_i
    Men        501      3.550
    Women      356      3.639

Find the mean grade for the entire class.


11. Assume that the following fictitious data represent the earnings per week of a
certain type of machine shop labor in Illinois establishments:

    Wage Group               Frequency
    $ 0   under $10.0            50
     10.0 under  20.0           150
     20.0 under  30.0           400
     30.0 under  40.0           200
     40.0 under  50.0           160
      *            *              *
     60.0 under  80.0            40

    Total                      1000

    * Class omitted. Note the different interval in the last class.

The average earnings per week for this same type of labor in all other states of
the United States where 9000 men are employed, not counting those in Illinois, are
$30.00 per week.

Compute the arithmetic mean wage (a) for Illinois, (b) for the entire United
States.

Recompute the mean wage for Illinois in such a manner as to check, in the
quickest and surest way, the accuracy of the result found in (a) above.

12. Find the mean of the following distribution:

      x        f
     47.5      7
     48.1     17
     45.9     46
     44.0     44
     40.7     54
     41.6     43
     38.0     35
     33.2     14


11. The Mode. That value of the variable which occurs most 
frequently is called the mode . It is the most probable value. Its 
chief service is in characterizing a type and it is the kind of average 
meant by such a phrase as the “ average man.” There is some diffi¬ 
culty in giving a precise definition of the mode without more advanced 
mathematics. However, we may say that for a given grouping an 
approximate value, which we will call the empirical mode, is given by 
the class mark having the largest frequency.¹ Thus, in Table 17
the empirical mode is 1.745 inches.

¹ Another method of computing the mode will be given in a later section.
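The empirical mode of Table 17 can be located by searching for the largest frequency. The Python sketch below is illustrative only: the class marks are generated from the first mark 0.245 and the interval 0.5, and the result is returned as a list so that a tie between two classes would be reported rather than hidden.

```python
# Frequencies of Table 17, in order of increasing class mark.
frequencies = [23, 42, 58, 62, 49, 47, 32, 27, 18, 15, 14,
               7, 10, 5, 6, 5, 3, 2, 5, 0, 1, 1]

# Class marks 0.245, 0.745, ..., 10.745 (rounded to avoid float noise).
class_marks = [round(0.245 + 0.5 * i, 3) for i in range(len(frequencies))]

largest = max(frequencies)

# Every class mark whose frequency equals the largest frequency.
modal_marks = [x for x, f in zip(class_marks, frequencies) if f == largest]

print(largest, modal_marks)  # 62 [1.745]
```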




12. The Median. Instead of finding the mean, suppose the N 
variates are arranged in the order of their magnitude. The median is 
defined as the value which is greater than half the variates and less 
than the other half. A more precise definition is as follows: 

Let $x_1, x_2, \ldots, x_N$ be a set of real numbers, which may or may not
be all different and suppose they are arranged in order of magnitude
so that

\[ x_1 \le x_2 \le x_3 \le \cdots \le x_N. \]

Whenever N is odd, $N = 2k - 1$, the median is $x_k$, the middle one of
the x's. If N is even, $N = 2k$, the median is not uniquely defined
unless $x_k = x_{k+1}$, in which case the median is this common value.
Otherwise, the definition is satisfied by any value of x belonging to
the interval

\[ x_k \le x \le x_{k+1}, \]

and the median is to this extent indeterminate. In this case it is
conventional to take

\[ \tfrac{1}{2}(x_k + x_{k+1}) \]

as the median.

Example. Find the median of the following set of numbers: 10, 6, 5, 25, 15, 18,
20.

Arranging them in order of magnitude we find the median to be 15 (the mean is
14.14). If we add another value, 37, to make N even, the median is ½(15 + 18) =
16.5 (the mean is 17).
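The definition just given translates directly into a short routine. In the Python sketch below the function name `median_by_definition` is an illustrative choice; the standard library function `statistics.median` follows the same convention for even N and is used only as a cross-check.

```python
from statistics import median


def median_by_definition(values):
    """Median of ungrouped variates, following the odd/even rule of the text."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty set is undefined")
    if n % 2 == 1:
        k = (n + 1) // 2          # N = 2k - 1: the median is x_k
        return ordered[k - 1]     # x_k in 1-based notation
    k = n // 2                    # N = 2k: take the mean of x_k and x_(k+1)
    return (ordered[k - 1] + ordered[k]) / 2


data = [10, 6, 5, 25, 15, 18, 20]
odd_median = median_by_definition(data)
odd_mean = sum(data) / len(data)

data_even = data + [37]
even_median = median_by_definition(data_even)
even_mean = sum(data_even) / len(data_even)

print(odd_median, round(odd_mean, 2))   # 15 14.14
print(even_median, even_mean)           # 16.5 17.0
print(median(data), median(data_even))  # 15 16.5
```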

13. Median of a Frequency Distribution. Case I. For a frequency
distribution of continuous variates, the median is defined as
follows:

Definition: The median is the value of x for which cum f = N/2. 
Given such a frequency distribution we may therefore find its 
median by forming a cumulative frequency table and interpolating in 
the end-x column for the value of x corresponding to N/2. 

The method should be clear from the following illustration. 








Find the median for the data of Table 2.

    Interval      f      End-x     Cum f
                         29.5        0
     30-39        2      39.5        2
     40-49        3      49.5        5
     50-59       11      59.5       16
     60-69       20      69.5       36
                          Md   <-   50
     70-79       32      79.5       68
     80-89       25      89.5       93
     90-99        7      99.5      100

Here, N/2 = 50. This value of cum f corresponds to a value of x
in the interval 69.5-79.5. Therefore the median is 69.5 plus a fraction
of the distance from 69.5 to 79.5. Thus,

     End-x       Cum f
     69.5         36
     Median       50
     79.5         68

where $d_1$ = Median - 69.5 and $D_1$ = 79.5 - 69.5 are the partial and total
differences in the end-x column, and $d_2$ = 50 - 36 and $D_2$ = 68 - 36 are the
corresponding differences in the cum f column.

Assuming that the items in any class interval are uniformly distributed
over that interval, it follows that the partial differences are
proportional to the total differences: $d_1/D_1 = d_2/D_2$. That is,

\[ \frac{\text{Median} - 69.5}{79.5 - 69.5} = \frac{50 - 36}{68 - 36}, \]

whence,

\[ \text{Median} = 69.5 + 10 \cdot \frac{14}{32} = 69.5 + 4.4 = 73.9. \]

This is called "straight line interpolation" or "interpolation by
proportional parts." The reason for these names is made clear in the
following diagram.


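From the similar triangles ABC and AED of the diagram,

\[ \triangle ABC \sim \triangle AED, \qquad \frac{AB}{AE} = \frac{BC}{ED}, \]

\[ x = AB = \frac{AE \cdot BC}{ED} = \frac{10(50 - 36)}{68 - 36}, \]

\[ \therefore \text{Md} = 69.5 + x = 73.9. \]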
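The interpolation by proportional parts can be written as a short routine. The Python sketch below is one possible arrangement under the same assumption of uniform distribution within the class; the function name `grouped_median` and its argument layout (class boundaries in one list, frequencies in another) are illustrative assumptions.

```python
def grouped_median(end_x, frequencies):
    """Median of a grouped distribution by straight line interpolation.

    end_x holds the class boundaries (one more entry than frequencies).
    """
    n = sum(frequencies)
    half = n / 2
    cum = 0  # cum f at the lower boundary of the current class
    for lower, upper, f in zip(end_x, end_x[1:], frequencies):
        if f > 0 and cum + f >= half:
            # Partial differences are proportional to total differences.
            return lower + (upper - lower) * (half - cum) / f
        cum += f
    raise ValueError("the distribution has no frequencies")


end_x = [29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5, 99.5]
freqs = [2, 3, 11, 20, 32, 25, 7]

md = grouped_median(end_x, freqs)
print(md)  # 73.875
```

The unrounded value 73.875 agrees with the 73.9 of the text, where the fraction 10(14/32) = 4.375 was rounded to 4.4.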


Case II. In the case of a set of discrete variates there may be no 
value in the set such that the number of variates which are larger than 
it is equal to the number less than it. Thus in Table 11 the values of x 
are integers and 42% of the throws yielded 6 or fewer successes and 
58% yielded 7 or more successes. Of course a formal application of 
the definition given in Case I will give a value of x for which cum f is 
N/2. The difficulty is not so much in the interpretation of the frac¬ 
tional result because the same objection could be cited against the 
mean. But the real difficulty lies in explaining interpolation in a 
discontinuous function. We cannot assume that the given frequen¬ 
cies are distributed over the interval from one value of x to the next. 
Perhaps the best we can do in such cases is to make a statement similar
to the one above for Table 11. At least such a statement serves
to summarize the situation without artificiality.

14. Graphical Interpretation of Mean, Median, and Mode. The 
mean corresponds to the abscissa of the point known in mechanics as 
the centroid of area. If a thin, homogeneous plate of metal cut in 
the shape of a histogram is supported loosely on a horizontal axis 
through its centroid, the plate will have no tendency to rotate, what¬ 
ever horizontal direction this axis may assume. 

The median of a frequency distribution is the abscissa of a point 
through which a vertical line will divide the total area of the histo¬ 
gram into two equal parts. 

If a distribution could be represented by a smooth curve, then the 
mode is the abscissa of the highest point on the curve. 



Figure 9 shows the position of the three averages in a moderately 
skew distribution. If the distribution were perfectly symmetrical 
then all three averages would coincide. 

15. Discussion. The student primarily interested in the use of 
these averages in practical statistics might reasonably inquire, 
“ Which of the three averages mentioned should be used in a given 
problem? ” The answer depends upon certain properties peculiar to 
each average and upon the nature of the data to be averaged. 

In most cases the mean is a distinctly superior average. It is 
rigorously defined, easily computed, and is most tractable in theoreti¬ 
cal discussions. 

When the median differs considerably from the mean it is likely 
that the median is the more typical value. The advantage of the 
median over the mean is recognized in at least three situations: 

(a) When occasional and unexpected values occur at the ends of 
the distribution. In such cases the mean would tend to distort the 
true representation of the typical value, being unduly influenced by 
the exceptional values. 




(b) When the data are presented in a table left open at one or both 
ends. For example, suppose the registrar’s office of the University 
reports the following distribution of grades as given in all departments 
for a semester: 


Grade        Below 60    60-69    70-79    80-89    90-100
Frequency       215       1060     2217     1242      506


A cum f table may be formed and hence the median found without 
any more information about the values less than 60. 
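The point of (b) can be checked numerically. The following is a minimal sketch, not the author's procedure verbatim: it assumes class boundaries at 59.5, 69.5, 79.5, 89.5, and 100.5 (an assumption, since the text gives only the class limits) and interpolates linearly within the class that contains the middle item. The function name grouped_median is illustrative.

```python
# Grade table from the text; the first class ("Below 60") is open-ended.
# The class boundaries (59.5, 69.5, ...) are an assumption of this sketch.
classes = [
    (None, 59.5, 215),    # lower boundary unknown
    (59.5, 69.5, 1060),
    (69.5, 79.5, 2217),
    (79.5, 89.5, 1242),
    (89.5, 100.5, 506),
]

def grouped_median(classes):
    """Median by linear interpolation in the cumulative-frequency table."""
    total = sum(f for _, _, f in classes)
    half = total / 2
    cum = 0
    for lower, upper, f in classes:
        if cum + f >= half:
            if lower is None:
                raise ValueError("median falls in the open-ended class")
            return lower + (half - cum) / f * (upper - lower)
        cum += f
    raise ValueError("empty table")

median = grouped_median(classes)
print(round(median, 2))
```

The open class contributes only its frequency to the cumulative count, so its unknown lower boundary never enters the calculation. The mean, by contrast, could not be computed without a class mark for "Below 60".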

(c) When the observations cannot be measured numerically but 
can be ordered. 

The mode is best adapted to situations where the word “usual”
would be appropriate. Unless a large number of items are
considered the mode can have little practical meaning. It is the
appropriate average in certain questions of marketing because
manufacturers are interested in the type or quality which is usually in
demand. Or again, in an investigation concerning wages and cost of
living, the mode would reflect the average situation. Also, in a
mathematical treatment of frequency curves the concept of the mode is
very useful.

Sometimes a distribution has more than one mode, although this is 
usually due to heterogeneous material. In this course we will be 
concerned only with unimodal distributions. 

For a more complete treatment of the applicability of these three 
averages, the student is referred to the following books: 

1. Zizek, Statistical Averages.

2. Yule and Kendall, Theory of Statistics, Ch. VII.

3. Burgess, The Mathematics of Statistics, Ch. V.

4. Camp, Mathematical Statistics, p. 40.

5. Chaddock, Principles and Methods of Statistics, Chs. VI-VIII.

Exercises 

1. What is the empirical mode in Table 10? Table 12?

2. Explain why the median is found by interpolating in the end-x
   column and not the mid-x column.

3. Read one or more of the references in §15 and write an essay on
   the advantages and limitations of the mean, median, and mode.

4. Find the median IQ for the data in Table 7.

5. Find the median for the data in Table 9.






16. Geometric Mean. The geometric mean of a series of N
positive values is the Nth root of their product. Thus, the geometric
mean (G.M.) of two values is the square root of their product, of three
values the cube root of their product, and in general for the N values
\(y_1, y_2, \ldots, y_N\),

\[ \text{G.M.} = [\,y_1 \cdot y_2 \cdot y_3 \cdots y_N\,]^{1/N}. \tag{10} \]

Equation (10) lends itself to the use of logarithms and they greatly
facilitate its computation. From (10) we have

\[ \log \text{G.M.} = \frac{1}{N}\,[\log y_1 + \log y_2 + \cdots + \log y_N]. \tag{11} \]

Therefore the arithmetic mean of the logarithms of a series of values
is the same as the logarithm of the geometric mean of the values
themselves.

Examples: Find the geometric mean of

(a) 3, 6, 12, 24, 48.

Solution:

\[ \text{G.M.} = [(3^5)(2^{10})]^{1/5} = (3)(2^2) = 12. \]

(b) 7.96, 13.82, 22.95, 35.34.

Solution:

    log  7.96 = 0.90091
    log 13.82 = 1.14051
    log 22.95 = 1.36078
    log 35.34 = 1.54827
          sum = 4.95047
     log G.M. = 4.95047 / 4 = 1.23762
         G.M. = 17.28
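The logarithmic route of equation (11) is easy to reproduce. The sketch below is illustrative only (the function name geometric_mean is not from the text); it averages common logarithms and takes the antilogarithm, as in the worked solution to (b).

```python
import math

def geometric_mean(values):
    """Geometric mean via equation (11): antilog of the mean of the logs."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log

gm_a = geometric_mean([3, 6, 12, 24, 48])
gm_b = geometric_mean([7.96, 13.82, 22.95, 35.34])
print(round(gm_a, 4), round(gm_b, 2))
```

Working through logarithms also avoids forming a very large product when N is large.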

The geometric mean is the appropriate average when the data are
limited at one end of the range and unlimited at the other, and there
tends to be a constant rate of change from one y value to the next.
This is characteristic of values which tend to form a geometric
progression, i.e., which tend to follow the simple exponential law

\[ y = ar^x. \tag{12} \]

The student will recall from algebra that a geometric progression can 
be put in the form 


    x     0     1     2      ...    x
    y     a     ar    ar^2   ...    ar^x






The value of any term in the y series is a function of the exponent of r 
since a and r are constants. The functional relationship is therefore 
represented by (12). 

The growth of many quantities in nature follows this law and it is
sometimes called the law of growth, although “exponential function”
would be more accurate. With x referring to time, y may represent,
for example, the population of a city, the enrollment of a school, the
weight of a quantity, or the number of bacteria in a culture. The
accumulated amount S of P dollars invested at rate of interest i,
compounded periodically for n periods, also takes the form of (12),
namely,

\[ S = P(1 + i)^n, \]

where r is now (1 + i), a is P, and n and S are the variables
corresponding to x and y.

Thus, if $1000 increased at compound interest to $2150 in 31 years,
the geometric average rate at which the money increased is found
from the equation

\[ r^{31} = (1 + i)^{31} = \frac{2150}{1000}, \]

\[ 1 + i = (2.15)^{1/31} = 1.025, \]

\[ i = 2\tfrac{1}{2}\%. \]

Since there was an increase of \(\frac{2150 - 1000}{1000} = 115\%\), the
arithmetic average would be \(\frac{115}{31} = 3.7\%\), which is also the
simple interest rate.
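The contrast between the two averages can be verified directly. This is a small sketch with illustrative variable names; it solves the compound-interest equation for i and compares the result with the simple-interest rate.

```python
principal, amount, years = 1000.0, 2150.0, 31

# Geometric (compound) average rate: solve (1 + i)**years = amount / principal.
geometric_rate = (amount / principal) ** (1 / years) - 1

# Arithmetic (simple-interest) average rate: total fractional gain spread evenly.
arithmetic_rate = (amount - principal) / principal / years

# Compounding at the geometric rate reproduces the final amount.
recovered = principal * (1 + geometric_rate) ** years

print(f"{geometric_rate:.4f} {arithmetic_rate:.4f} {recovered:.2f}")
```

Compounding at the arithmetic rate instead would overshoot $2150, which is why the geometric average is the appropriate one here.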

If y in equation (12) represents population, and we are given two
values of y corresponding to two dates N years apart, the geometric
mean enables us to find a fairer estimate of the value of y at the mid
date than would be given by any other average. For example,
suppose we are given that the population of a city was 2500 in 1920
and 5000 in 1930. We wish to estimate the population in 1925 and
to find the average annual rate of increase. If we are given no other
information, our best estimate for 1925 is given by

\[ \text{G.M.} = (y_1 \cdot y_2)^{1/2} = (2500 \times 5000)^{1/2} = 3535. \]





The average annual rate of increase is obtained by solving (12) for r as
follows:

\[ 5000 = 2500\,r^{10}, \]

\[ 2 = r^{10}. \]

Hence \(r = \sqrt[10]{2} = 1.0718 = 107.18\%\), so that the average annual
rate of increase is 7.18%. It is now possible to estimate the
population for any intermediate year. Thus, for 1928, we have from (12):

\[ y = 2500(1.0718)^8 = 4353. \]
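The same interpolation can be sketched in a few lines. The function name estimate_population and its default arguments are illustrative, with the two census figures taken from the example above.

```python
def estimate_population(year, year0=1920, pop0=2500.0, year1=1930, pop1=5000.0):
    """Interpolate along y = a * r**x between two known (year, population) points."""
    r = (pop1 / pop0) ** (1 / (year1 - year0))  # average annual growth factor
    return pop0 * r ** (year - year0)

annual_factor = (5000 / 2500) ** (1 / 10)
mid_estimate = estimate_population(1925)
estimate_1928 = estimate_population(1928)
print(round(annual_factor, 4), round(mid_estimate, 1), round(estimate_1928))
```

At the mid date the exponent is one-half, so the estimate coincides with the geometric mean of the two census figures.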


The geometric mean is also used in economics in averaging “index
numbers” which are essentially the ratios of prices of commodities at
one date to their prices at another date. In general it is the
appropriate average when emphasis is on the rate or percentage of change
rather than the amount.

17. Harmonic Mean. Another average which has long been
known and which is required in certain problems is the harmonic mean
(H.M.). For the N positive values \(x_1, x_2, \ldots, x_N\), it is defined
as the reciprocal of the arithmetic mean of the reciprocals of the
values. In symbols,

\[ \text{H.M.} = \frac{1}{\dfrac{1}{N}\left(\dfrac{1}{x_1} + \dfrac{1}{x_2} + \cdots + \dfrac{1}{x_N}\right)}. \tag{13} \]

This measure is used in averaging ratios, such as rates and prices, when
certain conditions are agreed upon.

In the case of time rates, we have ratios between two quantities
one of which is in units of time, which we will denote by t, and the
other is in units of some element like distance or accomplishment or
temperature, etc. Denote this second element, different from time,
by d. Then we make the following observations:

(a) A rate may be stated either in the form d/t or in the form t/d.
Thus, a car which travels at the rate of 30 miles per hour may also be
said to travel at the rate of 2 minutes per mile. In this illustration
the second form is not the usual way of expressing the rate, but there
are cases in which the form t/d is usual. When we say a man takes
10 seconds to run 100 yards we are expressing his rate in time per unit
of distance (t/d).

(b) In averaging rates one should first decide whether d or t should
properly be the basic or “fixed” element in the discussion.
Occasionally there is a difference of opinion about which element should
most appropriately be regarded as fixed. For example, suppose a
class of students has been given 15 minutes in which to work as many
as they can of a given list of problems, and the number of problems
worked correctly by each student recorded. Some educational
statisticians would say that time should be the fixed element here and
that number of problems solved (in a unit of time) should be the
variable. Others would say that the number of minutes (t) a student
required to work one problem is the proper variable and that
a problem (d) should be regarded as the fixed element in the
discussion.

In one case the rates are equally weighted in the sense of time 
and in the other case they are equally weighted in the sense of the 
element d. 

(c) The harmonic mean of the rates expressed in the form d/t gives
the same result as the arithmetic mean of the same rates expressed in
the form t/d. This is evident from equation (13) if it is written in the
form,

\[ \frac{1}{\text{H.M.}} = \frac{1}{N}\sum \frac{1}{x_i}, \]

and from the fact that rates in one form are merely the reciprocals of
the same rates in the other form.

As an illustration, let us consider three cars:

I

A travels at the rate of 15 miles per hour (1/4 mile per minute),
B travels at the rate of 20 miles per hour (1/3 mile per minute),
C travels at the rate of 30 miles per hour (1/2 mile per minute).

But their rates could just as well have been stated as

II

A travels at the rate of 4 minutes to the mile,
B travels at the rate of 3 minutes to the mile,
C travels at the rate of 2 minutes to the mile.

The harmonic mean of the rates as stated in I is 20 miles per hour;
i.e., 1/3 of a mile per minute, and the arithmetic mean of the rates as
stated in II is 3 minutes per mile or again, 20 miles per hour.
(Verify.)

The third observation, i.e., (c) above, suggests the following
discussion. The arithmetic mean of the rates in I is 21⅔ m.p.h. and this
is the harmonic mean of the rates as stated in II.
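The verification asked for above can be carried out with exact fractions. The helper names below are illustrative, not from the text.

```python
from fractions import Fraction

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def harmonic_mean(xs):
    """Reciprocal of the arithmetic mean of the reciprocals, equation (13)."""
    return len(xs) / sum(1 / x for x in xs)

mph = [Fraction(15), Fraction(20), Fraction(30)]   # form I: miles per hour (d/t)
min_per_mile = [60 / v for v in mph]               # form II: minutes per mile (t/d)

hm_I = harmonic_mean(mph)              # m.p.h.
am_II = arithmetic_mean(min_per_mile)  # minutes per mile
am_I = arithmetic_mean(mph)            # m.p.h.
hm_II = harmonic_mean(min_per_mile)    # minutes per mile

print(hm_I, am_II, am_I, 60 / hm_II)
```

Converting the harmonic mean of II back to miles per hour (60 divided by minutes per mile) gives the arithmetic mean of I, which is observation (c) read in the other direction.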






The question arises, which is the correct average, 20 m.p.h. or 21⅔
m.p.h.? The problem is indeterminate until it be agreed whether
time or distance is the fixed element. The correct average will differ
according to the condition agreed upon. This will be made clear in
the following analysis.

Case I. Let \(x_i = d_i/t_i\) denote the ith rate, \(i = 1, 2, \ldots, n\).
Then the average rate is

\[ \frac{D}{T} = \frac{\text{total distance}}{\text{total time}} = \frac{t_1 x_1 + t_2 x_2 + \cdots + t_n x_n}{t_1 + t_2 + \cdots + t_n}. \]

Condition 1. Let distance be the fixed element, i.e., let d be
constant. Then \(d = t_i x_i\), and \(t_i = d/x_i\). Therefore, the expression
for average rate becomes

\[ \frac{\sum t_i x_i}{\sum t_i} = \frac{nd}{d \sum \dfrac{1}{x_i}} = \frac{1}{\dfrac{1}{n}\sum \dfrac{1}{x_i}}, \]

which is the harmonic mean.

Condition 2. Suppose t is the fixed element. Then \(\sum t_i x_i\)
becomes \(t \sum x_i\) since t is a constant, and \(\sum t\) becomes nt. Hence,
we have for the average rate,

\[ \frac{D}{T} = \frac{t \sum x_i}{nt} = \frac{1}{n}\sum x_i, \]

which is the arithmetic mean.

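Case II. Let \(x_i = t_i/d_i\) denote the ith rate. Then the average
rate is

\[ \frac{T}{D} = \frac{\text{total time}}{\text{total distance}} = \frac{d_1 x_1 + d_2 x_2 + \cdots + d_n x_n}{d_1 + d_2 + \cdots + d_n}. \]

Condition 1. Suppose d is the fixed element. Then \(t_i = d x_i\) and
\(d = t_i/x_i\). Hence, we have

\[ \frac{T}{D} = \frac{d \sum x_i}{nd} = \frac{1}{n}\sum x_i, \]

which is the arithmetic mean.

Condition 2. Let t be fixed. Then \(d_i = t/x_i\) and the average
rate is

\[ \frac{T}{D} = \frac{nt}{t \sum \dfrac{1}{x_i}} = \frac{1}{\dfrac{1}{n}\sum \dfrac{1}{x_i}}, \]

which is the harmonic mean.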

We therefore state the following rules for averaging rates: 

Rule 1. The harmonic mean is used whenever the fixed element is d
and the rates are expressed in the form d/t, or when the fixed element
is t and the rates are expressed in the form t/d.

Rule 2. The arithmetic mean should be used when the fixed
element is t and the rates are expressed in the form d/t, or when the
fixed element is d and the rates are expressed in the form t/d.
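The two rules can be condensed into a single selector. The sketch below is only an illustration of the rules as stated; the function average_rate and its argument names are hypothetical. It also checks the result against total distance divided by total time for the three cars of the earlier illustration, assuming each covers the same 60 miles.

```python
def average_rate(rates, form, fixed):
    """Average rates by Rule 1 (harmonic mean) or Rule 2 (arithmetic mean).

    form  -- "d/t" or "t/d": how each rate is expressed
    fixed -- "d" or "t": the element held constant across observations
    """
    if form not in ("d/t", "t/d") or fixed not in ("d", "t"):
        raise ValueError("form must be 'd/t' or 't/d'; fixed must be 'd' or 't'")
    n = len(rates)
    use_harmonic = (form, fixed) in {("d/t", "d"), ("t/d", "t")}
    if use_harmonic:
        return n / sum(1 / x for x in rates)
    return sum(rates) / n

# Three cars each cover the same 60 miles (d fixed) at 15, 20, and 30 m.p.h.
mph = [15, 20, 30]
total_distance = 60 * len(mph)
total_time = sum(60 / v for v in mph)   # hours
print(average_rate(mph, "d/t", "d"), total_distance / total_time)
```

If instead each car drives for one hour (t fixed), the cars cover 65 miles in 3 hours, and the arithmetic mean of the same rates is the correct average.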

In the case of prices, which are of course ratios, a similar discussion
holds except that now the unit of time is to be replaced by a unit of
money. Therefore, prices are ratios between two quantities, one of
which is in units of money and the other in units of some commodity
or service. They may be stated as so much money per unit of
commodity (m/c), or as so many units of commodity per dollar (c/m).
Thus, if 100 bushels of wheat are exchanged for 75 dollars of gold, the
price of the wheat in terms of gold is 75 ÷ 100, or three-fourths of a
dollar of gold per bushel of wheat. Contrariwise, the price of gold
in terms of wheat is 100 ÷ 75, or one and one-third bushels of wheat
per dollar of gold. Thus, there are always two prices in any exchange.

The correct average will depend upon how the prices are stated and 
upon whether a unit of the commodity (or service) or a unit of money 
is the fixed element. 

The following reference is recommended, The Nature and Use of the 
Harmonic Mean — Ferger, Journal American Statistical Association, 
vol. 26 (1931), pp. 36-40. 


Examples 

1. In a certain factory a unit of work is completed by A in four minutes, by B 
in five minutes, by C in six minutes, by D in ten minutes, and by E in 
twelve minutes. What is their average rate of working? At this rate 
how many units will they complete in a six-hour day? 

Solution. The rates are expressed in the form t/d but it would seem
appropriate to regard t as the basic or fixed element since output per
unit of time appears to be the important consideration here. So by
Rule 1,

\[ \text{H.M.} = \frac{5}{\frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{10} + \frac{1}{12}}, \]

that is,

\[ \text{H.M.} = \frac{300}{48} = 6\tfrac{1}{4} \text{ minutes per unit.} \]

In 360 minutes they will complete \(\frac{360 \times 5 \times 4}{25} = 288\) units.
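The arithmetic of this solution can be confirmed with exact fractions. This is a minimal check with illustrative variable names; it also compares the answer with the sum of each worker's own daily output.

```python
from fractions import Fraction

minutes_per_unit = [4, 5, 6, 10, 12]   # workers A through E
n = len(minutes_per_unit)

# Harmonic mean of the times per unit (Rule 1: form t/d, fixed element t).
hm = n / sum(Fraction(1, m) for m in minutes_per_unit)

# Output of all five workers in a 360-minute day at the average rate...
units_at_average = n * 360 / hm
# ...agrees with adding up each worker's own output.
units_direct = sum(Fraction(360, m) for m in minutes_per_unit)

print(hm, units_at_average, units_direct)
```

The arithmetic mean of the five times, 7.4 minutes per unit, would give only about 243 units, which understates what the five workers actually produce in the day.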

2. A tourist purchases gasoline at three stations, as follows: 

    Station    Number of gallons of gasoline for $1.00
       A                        5
       B                        7
       C                        6

Here the prices are given in the form c/m and it would seem appropriate to regard 
gallon (c) as the fixed element and prices (m) per gallon as the variable quantities 
which are to be averaged. Hence, replacing d/t by c/m and “ rates ” by “ prices ” 
in Rule 1, we are led to find the harmonic mean. 


\[ \text{H.M.} = \frac{3}{\frac{1}{5} + \frac{1}{7} + \frac{1}{6}} = \frac{630}{107} \text{ gals. per \$1.00} = \frac{\$107}{630} \text{ per gal.} \]


Exercises 

1. The arithmetic mean of a set of 30 numbers is 82. What is the sum of
   these numbers?

2. In chemistry a student was graded 65 in final examination, 85 in
   recitation, and 80 in laboratory. These grades were weighted 1, 2, and
   3 respectively. Find the student’s average grade.

3. At the end of his first semester in college a freshman had credits as
   follows: 4 hours of mathematics with a grade of 88, 4 hours of English
   with a grade of 80, 3 hours of history with a grade of 85, and 4 hours
   of physics with a grade of 78. What was his average grade per hour of
   credit?

4. Find the median of Table 12.

5. The population of a city increased in 5 years from 225,000 to 245,000.
   What was the average increase per year? What was the average annual
   rate of increase?

6. The number of bacteria in a certain culture was found to be
   \(4 \times 10^6\) at noon of one day. At noon the next day the number
   was found to be \(9 \times 10^6\). If the number increased at a constant
   rate per hour, how many bacteria were there at midnight?

7. For two positive numbers, a and b, the geometric mean is
   \(x = \sqrt{ab}\). This is also called the mean proportional between a
   and b, since a : x = x : b. By drawing a semicircle on a + b as
   diameter, show how the value of x can be constructed geometrically.




8. The following table gives the population of the U. S. at each 10-year census 
from 1860 to 1920. 


    Year    Population    Ratio of Each Census
            (millions)    Figure to Preceding
    1860       31.4
    1870       38.6              1.23
    1880       50.2              1.30
    1890       63.0              1.25
    1900       76.0              1.20
    1910       92.0              1.21
    1920      105.7              1.15


What is the average rate of increase per decade? Using this average, 
estimate the population for 1930 from the 1920 census figure. 

9. If a series of positive variates form a geometric progression show that their 
logarithms form an arithmetic progression. 

10. Find the geometric mean of the following: 

(a) 2, 4, 8, 16, 32. 

(b) 47, 92, 123, 218. 

11. Given two sets of n positive variates each:

    \(x_{11}, x_{12}, x_{13}, \ldots, x_{1n}\)
    \(x_{21}, x_{22}, x_{23}, \ldots, x_{2n}\).

    Prove that the geometric mean of the ratios of corresponding variates
    in the two sets is equal to the ratio of their geometric means.

12. (a) For a frequency distribution of positive variates show that (10)
    becomes

    \[ \text{G.M.} = [\,x_1^{f_1} \cdot x_2^{f_2} \cdots x_k^{f_k}\,]^{1/N} \]

    where k is the number of different values of x in the set, any
    exponent \(f_i\) is the number of times \(x_i\) is repeated, and
    \(N = \sum_{i=1}^{k} f_i\).

    (b) What is the expression for log G.M. when G.M. is defined as in (a)?

13. A wholesale firm has twelve travelling salesmen who make trips of
    essentially the same length. Of these, eight make their trip in 20
    days and four in 15 days. What is the average time per trip?
    Ans. 18 days.

14. State two rules for averaging prices similar to those given for
    averaging rates. Give illustrations.

15. Consider any two positive variates \(x_1\) and \(x_2\). Prove that their
    geometric mean is equal to the geometric mean between their
    arithmetic mean and their harmonic mean.

16. (Burgess) The following problem arose in a statistical office in
    Washington during the World War: Suppose 20 boats make six
    trans-atlantic trips each per year, giving as the time for a “turn
    around” (i.e., time between consecutive departures from the same
    ports), one-sixth year, approximately 60 days, and that 10 boats make
    4 trips per year, giving as their time for a “turn around” one-fourth
    year, approximately 90 days. (A year of 360 days is used merely for
    convenience.) What is the average number of days per turn around?

    Hint. If we think of the rates expressed as “trips per year” then
    x = d/t. If t is regarded as the fixed element, then by Rule 2 the
    arithmetic mean is indicated, and x = 6 for 20 values of x, and x = 4
    for 10 values.

    If we think of the rates expressed as “days per trip” then x = t/d.
    If t is the fixed element, by Rule 1 the harmonic mean is the correct
    average, and x = 60 for 20 values and x = 90 for 10 values.
    Ans. 5⅓ trips per year or 67.5 days per trip.

17. Show that if 2a is the harmonic mean of the two rational numbers b
    and c, then the sum of the squares of the three numbers a, b, and c
    is the square of a rational number.

    (Reference: American Mathematical Monthly, June 1935, p. 394.)

18. (a) If A, G, and H represent, respectively, the arithmetic,
    geometric, and harmonic means of N unequal positive variates, prove
    that

    H < G < A

    (Reference: Burgess’ text, p. 101.)

    (b) What can you say if the N positive variates are equal?


CHAPTER IV 
MOMENTS 

1. Moments about an Arbitrary Origin. One of the general
problems of statistics is to summarize and characterize data. In the
words of R. A. Fisher,

A quantity of data which by its mere bulk may be incapable of entering the
mind is to be replaced by relatively few quantities which shall adequately
represent the whole, or which, in other words, shall contain as much as
possible, ideally the whole, of the relevant information contained in the
original data. 1

These “relatively few quantities” are usually expressed in terms
of moments. Moments are of different orders and the student is
already familiar with what is now to be known as the first moment,
namely, the arithmetic mean of the first powers of the variates. We
will also need in our work the arithmetic means, respectively, of the
second, third, and fourth powers of the variates. With reference to
an arbitrary origin, moments are denoted by \(\nu\) (the Greek letter nu)
with a subscript specifying the order.

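The first four moments, relative to the x-origin and in the x unit,
are defined as follows:

\[
\begin{aligned}
\nu_1 &= \frac{1}{N}\sum f_i x_i = \bar{x} \\
\nu_2 &= \frac{1}{N}\sum f_i x_i^2 \\
\nu_3 &= \frac{1}{N}\sum f_i x_i^3 \\
\nu_4 &= \frac{1}{N}\sum f_i x_i^4
\end{aligned}
\tag{1}
\]

i varying from 1 to k.

A more general definition of the \(\nu\)'s is

\[ \nu_r = \frac{1}{N}\sum_{i=1}^{k} f_i (x_i - x_0)^r \tag{1a} \]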


1 Foundations of Theoretical Statistics, Philosophical Transactions of the
Royal Society, vol. 222A (1922), p. 309.




for the rth moment about an arbitrary point \(x_0\). When \(x_0 = 0\) and
r = 1, 2, 3, and 4, we have the definitions stated in (1). If r = 0
we have the zeroth moment and \(\nu_0 = 1\).
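Definition (1a) translates almost word for word into code. The following sketch uses a small hypothetical distribution, chosen only for illustration, and an illustrative function name moment.

```python
def moment(xs, fs, r, origin=0.0):
    """r-th moment per unit frequency about `origin`, as in definition (1a)."""
    n = sum(fs)
    return sum(f * (x - origin) ** r for x, f in zip(xs, fs)) / n

# Hypothetical distribution, not taken from the text.
xs = [1, 2, 3]
fs = [1, 2, 1]

nus = [moment(xs, fs, r) for r in range(5)]
print(nus)

# Taking the mean as origin makes the first moment vanish.
first_about_mean = moment(xs, fs, 1, origin=nus[1])
print(first_about_mean)
```

The zeroth entry is always 1, since every term reduces to the frequency itself, and the first entry is the arithmetic mean.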

In statistics we work with moments per unit frequency. The
term “moment” has its origin in mechanics where we speak of the
“moment of a force.” Suppose we have a rigid bar, called a
lever, with one point of support known as a fulcrum (Figure 10). If
a force \(f_1\) is applied to the lever at a distance \(x_1\) from the
fulcrum 0, the product \(x_1 f_1\) is called the moment of the force. If
there are two or more such forces \(f_1, f_2, \ldots, f_k\), acting in the
same direction, and at the distances \(x_1, x_2, \ldots, x_k\), respectively
from 0, the total moment of all these forces is

\[ f_1 x_1 + f_2 x_2 + \cdots + f_k x_k = \sum f_i x_i. \]

Fig. 10

If the distances x are squared, we have \(\sum f_i x_i^2\) as the total second
moment, and \(\sum f_i x_i^r\) represents the rth moment.

It is by analogy with this mechanical concept that the expressions 
in (1) are called statistical moments (per unit frequency) about zero 
as origin. 


Exercises 

1. Write out the expanded form of the \(\nu\)'s defined in (1).

2. Calculate the values of \(\nu_1\), \(\nu_2\), and \(\nu_3\) for the following
   distributions:

          (a)                 (b)
        x     f             x     f
        0     1            -3     1
        1     3            -2     3
        2     5            -1     5
        3    10             0     5
        4     5             1     3
        5     2             2     1

3. (a) Prove that \(\nu_0\) is always equal to unity.

(b) Prove that moments of even order are always positive or zero, but that
moments of odd order may be positive, negative, or zero.






(c) Show that the odd moments are all zero if both the x's and f's are
symmetrical with respect to the origin of x, as, for example,

    x    -1.5   -1.0   -0.5    0.5    1.0    1.5
    f      1      2      3      3      2      1


2. Moments in Units of the Class Interval. In Chapter III,
§8, the mean in the x unit was obtained by first finding the mean in
the u unit, viz., \(\bar{u} = \frac{1}{N}\sum f_i u_i\), and then changing over
into the x unit by multiplying by the interval c. In our subsequent
work, which requires the higher moments, we shall find it convenient to
use a similar procedure, and find those moments in the u unit, where
\(u = (x - x_0)/c\). It is desirable, therefore, in labeling the moments
for any distribution, to specify whether they are in the unit of x or u.
This is commonly done by the use of a second subscript on \(\nu\). Thus
\(\nu_{r:u}\) denotes the rth moment in the u unit and relative to the
u-origin. Therefore,

\[
\begin{aligned}
\nu_{1:u} &= \frac{1}{N}\sum f_i u_i = \bar{u} \\
\nu_{2:u} &= \frac{1}{N}\sum f_i u_i^2 \\
\nu_{3:u} &= \frac{1}{N}\sum f_i u_i^3 \\
\nu_{4:u} &= \frac{1}{N}\sum f_i u_i^4.
\end{aligned}
\tag{2}
\]

Similarly, \(\nu_{r:x}\) will mean \(\frac{1}{N}\sum f_i x_i^r\). When there is no
ambiguity, the second subscript on \(\nu\) may be omitted.

3. Moments about the Mean. Formulas (1) and (2) define the
moments taken about zero as origin although in different units.
When the mean is chosen as origin we have the most important set
of moments in the theory of statistics. In this case the Greek letter
\(\mu\) (mu) is used to denote the moments, and it is always understood
that the use of \(\mu\) specifies the mean as origin. It does not, however,
designate the unit, so the second subscript may still be necessary.
Therefore, the rth moment about the mean is defined by either of the
following expressions:

\[
\begin{aligned}
\mu_{r:x} &= \frac{1}{N}\sum f_i (x_i - \bar{x})^r \\
\mu_{r:u} &= \frac{1}{N}\sum f_i (u_i - \bar{u})^r.
\end{aligned}
\tag{3}
\]

The mean is a sort of balance point. If weights proportional to
the frequencies are suspended along a horizontal bar at distances
from one end proportional to the numbers representing the class
marks, then the bar will balance at the mean of the distances. In
mechanics this point is known as the abscissa of the center of gravity
or centroid. Theorem VI of Chapter III, §7, is another way of
saying that the given distribution is in equilibrium about this point.

4. Relations between the \(\mu\)'s and \(\nu\)'s. We shall see that the
descriptive constants mentioned at the beginning of the chapter are
defined in terms of the moments about the mean, but the moments
about an arbitrary point are easier to calculate. In other words,
what we desire are the values of \(\mu_r\), but their computation directly
from the definitions (3) may be very laborious even in the u unit due
to the fact that \((u - \bar{u})\) usually involves decimals. Raising these
decimals to the second, third, and fourth powers becomes tedious
even with the aid of a computing machine. On the other hand, the
\(\nu\)'s defined in (2) are readily computed. Therefore, instead of
computing the \(\mu\)'s directly we obtain them indirectly from the \(\nu\)'s.
The relations between the \(\mu\)'s and \(\nu\)'s can be found by expanding, by
the Binomial Theorem, either of the expressions following the \(\sum\)'s in
(3) for r = 2, 3, 4. This is done in the u unit as follows:

\[
\begin{aligned}
\mu_2 &= \frac{1}{N}\sum f_i (u_i - \bar{u})^2 \\
      &= \frac{1}{N}\sum f_i u_i^2 - 2\bar{u} \cdot \frac{1}{N}\sum f_i u_i + \bar{u}^2 \\
      &= \nu_2 - 2\bar{u}\,\nu_1 + \bar{u}^2 \\
      &= \nu_2 - (\nu_1)^2, \quad \text{since } \bar{u} = \nu_1.
\end{aligned}
\tag{4}
\]

\[
\begin{aligned}
\mu_3 &= \frac{1}{N}\sum f_i (u_i - \bar{u})^3 \\
      &= \nu_3 - 3\nu_2 \cdot \nu_1 + 2(\nu_1)^3
\end{aligned}
\tag{5}
\]

\[ \mu_4 = \nu_4 - 4\nu_3 \cdot \nu_1 + 6\nu_2(\nu_1)^2 - 3(\nu_1)^4. \tag{6} \]



These formulas are important and the student should be able to
derive them. It should be apparent that these moment relations
hold also in the x unit. However, if we have the \(\mu\)'s in the u unit
and we desire them in the x unit they may be found as follows:

\[
\begin{aligned}
\mu_{2:x} &= c^2 \mu_{2:u} \\
\mu_{3:x} &= c^3 \mu_{3:u} \\
\mu_{4:x} &= c^4 \mu_{4:u}.
\end{aligned}
\tag{7}
\]

The first of the relations given in (7) is proved below. The others
may be proved in a similar manner.

\[
\begin{aligned}
\mu_{2:x} &= \frac{1}{N}\sum f_i (x_i - \bar{x})^2 && \text{by definition,} \\
          &= \frac{1}{N}\sum f_i (x_0 + c u_i - x_0 - c\bar{u})^2 && \text{by (4a) and (5), Chapter III,} \\
          &= \frac{c^2}{N}\sum f_i (u_i - \bar{u})^2 = c^2 \mu_{2:u}.
\end{aligned}
\]

We see that the indirect method of computing the \(\mu\)'s (in the u unit)
involves two steps. First the \(\nu\)'s are computed according to the
definitions in (2). This step is illustrated in Table 18. Then we
calculate the \(\mu\)'s by substituting the computed \(\nu\)'s in relations (4),
(5), and (6). The \(\mu\)'s in the x unit could then be obtained, if desired,
by means of (7).

Before proceeding with the second step it is desirable to check the
\(\nu\)'s or, at least, the totals of the columns from which they are
obtained. This can be done if we have another column headed
\(f(u + 1)^4\), and observe that

\[ \sum f(u + 1)^4 = \sum f u^4 + 4\sum f u^3 + 6\sum f u^2 + 4\sum f u + \sum f. \]

This is known as Charlier’s check. An alternative one is to check the 
entries in the column fu 4 against the proper entries in Pearson’s 
Tables for Statisticians and Biometricians , Table L. 

Charlier’s check is a necessary but not a sufficient check. That is 
to say, compensating errors may occur which this check would not 
detect. However, the occurrence of such errors is very unlikely. 

Applying Charlier’s check to Table 18 we have 

1220 = 1088 + 4(—236) + 6(176) + 4(-20) + 100 = 1220 
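Charlier's check is simple to automate. The sketch below uses the frequencies and coded deviations of Table 18; the helper name column_sum is illustrative.

```python
# Table 18 of the text: frequencies f and coded deviations u.
f = [2, 3, 11, 20, 32, 25, 7]
u = [-4, -3, -2, -1, 0, 1, 2]

def column_sum(power):
    """Sum of f * u**power over all classes (one column total of Table 18)."""
    return sum(fi * ui ** power for fi, ui in zip(f, u))

# Left side: the extra column f(u + 1)^4.
lhs = sum(fi * (ui + 1) ** 4 for fi, ui in zip(f, u))

# Right side: binomial expansion in terms of the other column totals.
rhs = (column_sum(4) + 4 * column_sum(3) + 6 * column_sum(2)
       + 4 * column_sum(1) + column_sum(0))

print(lhs, rhs)
```

Agreement of the two sides confirms the column totals jointly. As the text notes, compensating errors would still pass unnoticed.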



Moments 


65 


Table 18. Moments for Distribution of Grades

         Data                          Computations
      x       f      u      fu     fu^2     fu^3     fu^4   f(u+1)^4
    34.5      2     -4      -8      32     -128      512      162
    44.5      3     -3      -9      27      -81      243       48
    54.5     11     -2     -22      44      -88      176       11
    64.5     20     -1     -20      20      -20       20        0
    74.5     32      0       0       0        0        0       32
    84.5     25      1      25      25       25       25      400
    94.5      7      2      14      28       56      112      567
    Sums    100            -20     176     -236     1088     1220
 (1/N) Sums   1           -.20    1.76    -2.36    10.88   (for Charlier's check)
                        ν_{1:u}  ν_{2:u}  ν_{3:u}  ν_{4:u}


Hence we may proceed with confidence to compute the \(\mu\)'s. Using
relations (4), (5), and (6):

\[ \mu_{2:u} = 1.76 - (-.20)^2 = 1.72 \]

\[ \mu_{3:u} = -2.36 - 3(1.76)(-.20) + 2(-.20)^3 = -1.320 \]

\[ \mu_{4:u} = 10.88 - 4(-2.36)(-.20) + 6(1.76)(-.20)^2 - 3(-.20)^4 = 9.4096 \]
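Both routes to the \(\mu\)'s can be compared numerically. The following sketch computes them indirectly through relations (4), (5), and (6) and directly from definition (3), using the data of Table 18; the function names nu and mu_direct are illustrative.

```python
f = [2, 3, 11, 20, 32, 25, 7]
u = [-4, -3, -2, -1, 0, 1, 2]
n = sum(f)

def nu(r):
    """r-th moment about the u-origin, definition (2)."""
    return sum(fi * ui ** r for fi, ui in zip(f, u)) / n

def mu_direct(r):
    """r-th moment about the mean, computed straight from definition (3)."""
    u_bar = nu(1)
    return sum(fi * (ui - u_bar) ** r for fi, ui in zip(f, u)) / n

# Indirect route through relations (4), (5), and (6).
mu2 = nu(2) - nu(1) ** 2
mu3 = nu(3) - 3 * nu(2) * nu(1) + 2 * nu(1) ** 3
mu4 = nu(4) - 4 * nu(3) * nu(1) + 6 * nu(2) * nu(1) ** 2 - 3 * nu(1) ** 4

print(round(mu2, 4), round(mu3, 4), round(mu4, 4))
```

On a modern machine the direct route is no longer laborious, but the agreement of the two routes is a useful check on the relations themselves.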

Before explaining the applications of \(\mu_2\), \(\mu_3\), and \(\mu_4\) we present
some exercises which will aid the student in mastering the procedure thus
far developed.


Exercises 

1. (a) Verify relations (4), (5), and (6).

(b) Show that these relations hold also in the x unit.

(c) Prove that \(\mu_1 = 0\) in any unit.

(d) When f = 1, show that

\[ \mu_{2:x} = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \bar{x}^2. \]

2. Verify the relations given in (7).

3. Using Table 18 as a model find the \(\nu\)'s for Iowa City rainfall by
   extending Table 17.

4. Find the \(\mu\)'s from your results in Exercise 3 above.









5. Standard Deviation. Formula (4), \(\mu_2 = \nu_2 - \nu_1^2\), is perhaps
the most important of the moment relations for elementary statistics.
It states that the second moment about the mean is equal to the
second moment about zero diminished by the square of the mean
measured from zero.

Many of the definitions in statistics are essentially those of physics
and mechanics. The analogy between the mean and centroid has
been mentioned. The above statement about formula (4) is a
well-known proposition in mechanics when the word centroid is
substituted for mean.

In mechanics the equivalent of \(N\mu_2\) is called the moment of inertia
(about the axis through the centroid) and \((\mu_2)^{1/2}\) is the radius of
gyration. These notions are carried over in statistics. Suppose a thin
metal plate in the shape of a histogram is rotating about a vertical
axis through its centroid. There is a distance from the centroid at
which the entire mass of the histogram could be concentrated
without changing its moment of inertia. This distance is the
square root of \(\mu_2\). It is an average rotational radius for all
particles of the rotating mass. In statistics, \((\mu_2)^{1/2}\) is called the
standard deviation and is denoted by the small Greek letter \(\sigma\).
Therefore, we have

\[ \sigma_x = \sqrt{\mu_{2:x}} = c\,\sigma_u. \tag{8} \]

We shall see later that \(\sigma\) is a measure of what is called dispersion.
More precisely, it measures the extent to which the data are spread
out “on the average” on either side of the mean. (See Figure 11.)
The student will obtain a more complete understanding of \(\sigma\) as the
course develops.

The mean and standard deviation are always expressed finally in
the same units as the variates. If x represents inches, we desire the
mean and standard deviation in inches. When obtained they should be
labeled appropriately.



Fig. 11


Example. For Table 18, we have

\[ \bar{x} = c\bar{u} + x_0 = 10(-.20) + 74.5 = 72.5\% \]

\[ \sigma_u = \sqrt{\mu_{2:u}} = (1.72)^{1/2} = 1.31 \]

\[ \sigma_x = c\,\sigma_u = 10(1.31) = 13.1\%. \]
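The coded and the direct computations of this example can be set side by side. The sketch below uses the class marks and frequencies of Table 18 with \(x_0 = 74.5\) and \(c = 10\); the divisor is N throughout, as in the text, and the variable names are illustrative.

```python
import math

x = [34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]   # class marks of Table 18
f = [2, 3, 11, 20, 32, 25, 7]
n = sum(f)

# Direct computation in the x unit.
mean_x = sum(fi * xi for fi, xi in zip(f, x)) / n
sigma_x = math.sqrt(sum(fi * (xi - mean_x) ** 2 for fi, xi in zip(f, x)) / n)

# Coded computation: u = (x - x0) / c with x0 = 74.5 and c = 10.
x0, c = 74.5, 10
u = [(xi - x0) / c for xi in x]
mean_u = sum(fi * ui for fi, ui in zip(f, u)) / n
sigma_u = math.sqrt(sum(fi * ui ** 2 for fi, ui in zip(f, u)) / n - mean_u ** 2)

print(round(mean_x, 1), round(sigma_u, 2), round(sigma_x, 1))
```

The origin \(x_0\) enters the mean but not the standard deviation, which depends only on the class interval c.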

Thus, we have explained the use of the first and second moments. 

The student will observe that the change from \(\sigma_u\) to \(\sigma_x\) does not
involve \(x_0\). The standard deviation is affected by the change in units
but is independent of the origin of reference. To prove this let
\(x' = x - x_0\), whence \(\bar{x}' = \bar{x} - x_0\) (why?). Then

\[
\begin{aligned}
\sigma_{x'}^2 = \mu_{2:x'} &= \frac{1}{N}\sum f_i (x_i' - \bar{x}')^2 \\
                           &= \frac{1}{N}\sum f_i (x_i - x_0 - \bar{x} + x_0)^2 \\
                           &= \mu_{2:x} = \sigma_x^2.
\end{aligned}
\]

This suggests the more general

Theorem. The value of \(\mu_r\) remains invariant under a transformation
which changes only the origin of reference of the variates.

The student is asked to prove the equivalent of this theorem in
Exercise 3 after §9.

6. Standard Units. The above section explains \(\mu_2\). There
remains the explanation of \(\mu_3\) and \(\mu_4\). We will lead up to this by
defining standard units. We have mentioned the transformation
\(x' = x - \bar{x}\). Another very useful transformation consists in
measuring such deviations from the mean in units of the standard
deviation, \(\sigma_x\), of the entire distribution. They are then known as
standard units and will be designated by t. Thus,

\[ t = \frac{x - \bar{x}}{\sigma_x} = \frac{x'}{\sigma_x}. \tag{9} \]

Graphically, this translates the origin to the mean and measures
distances along the horizontal axis in terms of \(\sigma_x\). It is a special
case of the more general transformation

\[ u = \frac{x - x_0}{c}. \]






The significant characteristic of the t variate is its independence of
the unit in which the original measurements were taken. For
example, suppose we were concerned with obtaining the linear
measurements of a set of individuals. One distribution of variates would
result if the measurements were made in feet. In this case \(x'\),
\(\bar{x}\), and \(\sigma_x\) would also be in feet. If the measurements were taken
in inches, then \(x'\), \(\bar{x}\), and \(\sigma_x\) would be in inches, and each of
these values would be, numerically, twelve times as large as the
corresponding numbers in the first distribution. However, the variates
expressed in standard units would be the same for the two distributions.
Thus if

\[ \bar{x} = 50 \text{ ft.} = 50(12) \text{ in.}, \]

and

\[ \sigma_x = 5 \text{ ft.} = 5(12) \text{ in.}, \]

then for an individual measurement of x = 60 ft. = 60(12) in., we
have

\[ t = \frac{10 \text{ ft.}}{5 \text{ ft.}} = \frac{10(12) \text{ in.}}{5(12) \text{ in.}} = 2. \]

It is obvious, therefore, that standard units provide a basis for
comparing distributions. Moreover, they make possible important
simplifications in certain mathematical operations.
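The independence of unit can be demonstrated on any set of measurements. The data below are hypothetical and serve only as an illustration; the function standard_units is an illustrative name and uses the divisor N, as in the text.

```python
import math

def standard_units(values):
    """Return t = (x - mean) / sigma for each value (divisor N, as in the text)."""
    n = len(values)
    mean = sum(values) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sigma for v in values]

heights_ft = [4.5, 5.0, 5.5, 6.0, 6.5]        # hypothetical measurements in feet
heights_in = [12 * h for h in heights_ft]     # the same measurements in inches

t_ft = standard_units(heights_ft)
t_in = standard_units(heights_in)
print([round(t, 3) for t in t_ft])
print([round(t, 3) for t in t_in])
```

The factor of twelve multiplies both the deviations and the standard deviation, so it cancels in the ratio. The t values also have mean zero and standard deviation one.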

With the aid of a computing machine, a distribution may be easily 
transformed into standard units by means of the so-called continuous 
process. To illustrate, suppose for the distribution of Table 9 (§10, 
Chapter I), it has been found that 

$\bar{x} = 47.712$ lbs.

$\sigma_x = 5.772$ lbs.

By relation (9), then,

$t = \dfrac{x - 47.712}{5.772} = .17325x - 8.2661.$

Referring to the discussion of the continuous method given in the
Introduction, we observe that here $k = 8.2661$, $n = .17325$, and we
desire the values of $t$ corresponding to the values of $x$ given in Table 9.
For the values of $x$ such that $nx < k$, we write the above relation in
the form

$-t = 8.2661 - .17325x.$




The procedure now is to register 8.266100 on the lower dial, punch the 
constant factor .17325 on the keyboard, and then by turning the 
crank backward so that the successive values of x appear on the upper 
dial, we subtract from k the products of this multiplier and the values 
of x. The various values of x are built over from one to another 
without clearing the dial. The resulting values of $-t$ are read at
each stage from the lower dial until we get $-t = 0.383$. From here,
$nx > k$, so we clear the dials and start over using the original form
of the relation between $x$ and $t$. We now register -8.266100 on the
lower dial by turning the crank backward, punch .17325 on the key¬ 
board, and turn the crank forward to form the values of x on the 
upper dial. The values of t are read as before from the lower dial at 
each stage of the build-over process. In this way the following set 
of standard variates is obtained: 


Table 19

    x        f         t

  29.5       1      -3.155
  33.5      14      -2.462
  37.5      56      -1.770
  41.5     172      -1.076
  45.5     245      -0.383
  49.5     263       0.310
  53.5     156       1.003
  57.5      67       1.696
  61.5      23       2.389
  65.5       3       3.082


We see from Table 19 that a range of t = ±3 takes in practically all 
the variates. This is typical of the more common distributions. 
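The machine procedure above amounts to evaluating $t = nx - k$ for each class mark. As a rough modern equivalent, the sketch below recomputes the standard variates of Table 19 in Python; the variable names are illustrative assumptions, and the class marks, mean, and standard deviation are those quoted in the text. Because the constants are not rounded here, a value may differ from the table by one unit in the third decimal.

```python
# Class marks of Table 9 and the constants found for that distribution.
class_marks = [29.5, 33.5, 37.5, 41.5, 45.5, 49.5, 53.5, 57.5, 61.5, 65.5]
mean, sd = 47.712, 5.772

n = 1 / sd      # the multiplier, about .17325
k = mean / sd   # the constant term, about 8.2661

# t = n*x - k is the same relation as t = (x - mean) / sd.
t_values = [n * x - k for x in class_marks]

for x, t in zip(class_marks, t_values):
    print(f"{x:5.1f}  {t:7.3f}")
```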

If $\bar{x} = 0$, then $t = x/\sigma$ and the origin of $t$ is the same as the origin of $x$.
Some writers use $X$ to denote the variates (i.e., pounds, dollars,
temperatures, etc.), and use $x$ to denote deviations from the mean. In
that notation, $t = x/\sigma$ would have the same meaning as our equation
(9). Occasionally in later chapters we shall find it convenient to
designate deviations from the mean by $x$ (instead of $x'$). If so, it
will be stated that the origin of $x$ is at the mean or centroid.

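7. Moments in Standard Units. The moments in standard units
are denoted by the Greek letter alpha, $\alpha$. Thus for the $r$th moment
in standard units, we have $\alpha_r = \dfrac{1}{N} \sum f_i t_i^r$. However, it is not
necessary to transform the variates into $t$ units in order to compute the $\alpha$'s.
We shall show that they are functions of the $\mu$'s. Thus

$\alpha_r = \dfrac{1}{N} \sum f_i t_i^r$    by definition

$= \dfrac{1}{N} \sum f_i \left( \dfrac{x_i - \bar{x}}{\sigma_x} \right)^r$    from (9)

$= \dfrac{1}{\sigma_x^r} \cdot \dfrac{1}{N} \sum f_i (x_i - \bar{x})^r.$    Why?

Hence

(10)    $\alpha_r = \dfrac{\mu_{r:x}}{(\sigma_x)^r}$    Why?

$= \dfrac{\mu_{r:x}}{(\mu_{2:x})^{r/2}}$    from (8).

Letting $r = 1, 2, 3, 4$ in (10) we have

(10a)    $\alpha_1 = \dfrac{\mu_1}{\sigma_x} = 0$

$\alpha_2 = \dfrac{\mu_2}{\sigma_x^2} = 1$

$\alpha_3 = \dfrac{\mu_{3:x}}{(\sigma_x)^3}$

$\alpha_4 = \dfrac{\mu_{4:x}}{(\sigma_x)^4}.$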

It is obvious that $\alpha_1$ and $\alpha_2$ are abstract numbers. This is also the
case for the other $\alpha$'s. In the expressions for $\alpha_3$ and $\alpha_4$ both
numerator and denominator are of the same dimension. That is to say, in
$\alpha_3 = \mu_3/\sigma^3$ both numerator and denominator are the cubes of
whatever unit is used in the original measurements, and therefore their
ratio is of zero dimension, a pure number. Similarly, in $\alpha_4 = \mu_4/\sigma^4$
both numerator and denominator are the fourth powers of the same
unit, and therefore $\alpha_4$ is an abstract number.

Some writers use $\sqrt{\beta_1}$ for our $\alpha_3$ and $\beta_2$ for our $\alpha_4$.
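The claim that the $\alpha$'s can be found either from the $t$ variates or from the $\mu$'s, and that they are pure numbers, can be illustrated numerically. The sketch below is in Python; the small frequency distribution and all function names are hypothetical, chosen only for illustration.

```python
from math import sqrt

# A small hypothetical frequency distribution (not from the text).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
fs = [1, 3, 5, 2, 1]

def mean_of(xs, fs):
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)

def moment_about_mean(xs, fs, r):
    """mu_r = (1/N) * sum f_i (x_i - mean)^r."""
    m = mean_of(xs, fs)
    return sum(f * (x - m) ** r for x, f in zip(xs, fs)) / sum(fs)

def alpha_from_mu(xs, fs, r):
    """alpha_r = mu_r / sigma^r, as in relation (10)."""
    sigma = sqrt(moment_about_mean(xs, fs, 2))
    return moment_about_mean(xs, fs, r) / sigma ** r

def alpha_from_t(xs, fs, r):
    """alpha_r = (1/N) * sum f_i t_i^r, the definition in standard units."""
    m = mean_of(xs, fs)
    sigma = sqrt(moment_about_mean(xs, fs, 2))
    return sum(f * ((x - m) / sigma) ** r for x, f in zip(xs, fs)) / sum(fs)

for r in (1, 2, 3, 4):
    print(r, alpha_from_mu(xs, fs, r), alpha_from_t(xs, fs, r))
```

Multiplying every value in `xs` by a constant, such as 12 for feet to inches, leaves each `alpha_from_mu` result unchanged apart from floating-point rounding.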







8. Use of $\alpha_3$ and $\alpha_4$. Since $\alpha_1$ and $\alpha_2$ have the same values for all
frequency distributions, their computation contributes nothing to
the description or characterization of a distribution. But the values
of $\alpha_3$ and $\alpha_4$ depend upon the shape of the histogram representing a
distribution, and are therefore useful in distinguishing between types
of distributions. Thus, we observe that

$\mu_3 = \dfrac{1}{N} \sum f_i (x_i - \bar{x})^3$

is a measure of asymmetry about the mean. If the variates are
distributed symmetrically about $\bar{x}$ then $\mu_3 = 0$. But if the positive
deviations from the mean outweigh the negative deviations then
$\mu_3 > 0$, whereas if the negative deviations predominate, then $\mu_3 < 0$.
Cubing the deviations gives a measure which is sensitive both to their
size and sign, but the result is in cubic units. Now symmetry, or lack
of it, is not a function of the original units of measurement, so if we
divide $\mu_3$ by $\sigma^3$ we get a pure number. Thus $\alpha_3$ is a satisfactory
measure for comparing symmetry in distributions of different units of
measurement.

The quantity $\alpha_4$ measures a characteristic called "kurtosis." It
refers to the relative number of variates in the vicinity of the mean.
More will be said about $\alpha_3$ and $\alpha_4$ later on. At this time emphasis
should be placed upon their calculation rather than upon the
information which they yield.

Inasmuch as the $\alpha$'s are independent of the unit of measurement,
they may be computed from the moments in the $u$ unit. Changing
these moments into the $x$ unit would only introduce the same factor
into the numerator and denominator, which would of course cancel
out. Thus:

$\alpha_3 = \dfrac{\mu_{3:x}}{\sigma_x^3} = \dfrac{c^3 \mu_{3:u}}{c^3 \sigma_u^3} = \dfrac{\mu_{3:u}}{\sigma_u^3}$

$\alpha_4 = \dfrac{\mu_{4:x}}{\sigma_x^4} = \dfrac{c^4 \mu_{4:u}}{c^4 \sigma_u^4} = \dfrac{\mu_{4:u}}{\sigma_u^4}$


For Table 18 we have

$\alpha_3 = \dfrac{-1.320}{(1.72)(1.31)} = -0.586,$

$\alpha_4 = \dfrac{9.4096}{(1.72)^2} = 3.18.$







Although no limits can be placed on the possible values which $\alpha_3$
and $\alpha_4$ may take, it may be said that for the more common
distributions $\alpha_4$ fluctuates around 3 and $\alpha_3$ is usually not more than 2 nor
less than -2. We cannot go into the theoretical reasons for these
values and we mention them here merely to guide the student as to
what is a reasonable result to expect in the exercises in this book.
When the numerical value of $\alpha_3$ is large, the distribution may be of
the J-shaped type which is an extreme form of the asymmetrical type.
However, these types cannot always be distinguished by elementary
methods if the original data are not available.

9. Summary. The quantities $\bar{x}$, $\sigma_x$, $\alpha_3$, and $\alpha_4$ are called the
descriptive constants of the distribution. They are the "relatively
few quantities" (§1) which, in certain cases, contain all the relevant
information in the distribution. Table 20 will serve as a model for
the procedure which the student should follow in computing these
quantities. Of course, if the work is done on a computing machine,
only the totals of the power sums need be recorded. The detail of
the columns may be omitted. In Table 20, $c = 1$, so $\sigma_x = \sigma_u$. Of
course, this would not be true in general.

The calculation of the $\nu$'s proceeds naturally as an extension of the
work required to compute $\bar{x}$ for a frequency distribution. Thus to
obtain $\bar{x}$ we first compute $\nu_{1:u} = \bar{u}$ and then obtain $\bar{x}$ from the relation

$\bar{x} = c\bar{u} + x_0.$

To obtain the standard deviation we need the value of $\nu_2$ because $\sigma_x$
is found from the relations

$\mu_{2:u} = \nu_{2:u} - \bar{u}^2$

$\sigma_u = \sqrt{\mu_{2:u}}$

$\sigma_x = c\sigma_u.$

The next chapter is devoted to a discussion of dispersion of which $\sigma_x$
is a measure. To be sure, the standard deviation is only one of several
measures of dispersion, just as the mean is only one of several averages. 
But both the mean and the standard deviation play important roles 
in the theory and practice of statistics. The student should master 
the pattern by which they are computed in a frequency distribution. 

In order to compute $\alpha_3$ and $\alpha_4$ we first require $\nu_3$ and $\nu_4$ (in addition
to $\nu_1$ and $\nu_2$). Then $\mu_3$ and $\mu_4$ are obtained from (5) and (6). Finally,

$\alpha_r = \dfrac{\mu_r}{\sigma^r}$

is computed for $r = 3$ and $r = 4$. The characteristics of a
distribution which $\alpha_3$ and $\alpha_4$ describe will be discussed in Chapter VI and
again in Part II. In elementary work they are less important than
$\bar{x}$ and $\sigma_x$.


Exercises 

1. (a) What is the numerical value of the mean of any distribution of variates
expressed in $t$ units?

(b) What is the standard deviation of such a distribution? Hint: $\sigma_t = \sqrt{\alpha_2}$.

2. (a) Show that $(x - \bar{x}) = c(u - \bar{u})$ and hence that $t = (u - \bar{u})/\sigma_u$.

(b) Show that we obtain the same results for the $\alpha$'s if we take

$t = \dfrac{u - \bar{u}}{\sigma_u}.$

3. Prove: If any constant be added algebraically to each variate of a series the
values of $\mu_r$ for the new series will be identical to the corresponding values
of $\mu_r$ of the original series.

4. Suppose each variate is multiplied by a constant. What effect would this
have on $\bar{x}$, $\sigma_x$, $\alpha_3$, and $\alpha_4$?

5. Show that the standard deviation of x may be written 

a. = [££/«(* - 

-I 

6. Prove the general relation

$\mu_{r:x} = c^r \mu_{r:u}$

of which the relations given in (7) are special cases when $r = 2, 3, 4$.
Hint: $(x - \bar{x}) = c(u - \bar{u})$.

7. (a) Show that $\alpha_0 = 1$.

(b) Show that $\sigma^r = (\mu_2)^{r/2}$ in both the $x$ and $u$ units.

8. Prove from (4) that $\mu_2$ is less than or at most equal to $\nu_2$, the same unit being
used in each case.

9. Find $\bar{x}$, $\sigma_x$, $\alpha_3$, and $\alpha_4$ for Iowa City rainfall using your results from
Problem 4 of the preceding set of Exercises.

Ans. $\bar{x} = 2.80$ in., $\alpha_3 = 1.29$,
$\sigma_x = 2.01$ in., $\alpha_4 = 4.58$.

10. Using Table 20 as a model find $\bar{x}$, $\sigma_x$, $\alpha_3$, and $\alpha_4$ for the distributions in §10,
Chapter I, according to the direction of the instructor.






Table 20. Specimen Worksheets for Computing the Characterizing
Constants of a Distribution

Subject: Span among Adult Males (Table 13)

   x        f      u      uf     u^2 f     u^3 f     u^4 f   (u+1)^4 f

 58.5       1    -11     -11       121    -1,331    14,641     10,000
 59.5       2    -10     -20       200    -2,000    20,000     13,122
 60.5       1     -9      -9        81      -729     6,561      4,096
 61.5       6     -8     -48       384    -3,072    24,576     14,406
 62.5       7     -7     -49       343    -2,401    16,807      9,072
 63.5      22     -6    -132       792    -4,752    28,512     13,750
 64.5      55     -5    -275     1,375    -6,875    34,375     14,080
 65.5     111     -4    -444     1,776    -7,104    28,416      8,991
 66.5     146     -3    -438     1,314    -3,942    11,826      2,336
 67.5     182     -2    -364       728    -1,456     2,912        182
 68.5     229     -1    -229       229      -229       229          0
 69.5     265      0       0         0         0         0        265
 70.5     263      1     263       263       263       263      4,208
 71.5     217      2     434       868     1,736     3,472     17,577
 72.5     176      3     528     1,584     4,752    14,256     45,056
 73.5     132      4     528     2,112     8,448    33,792     82,500
 74.5      82      5     410     2,050    10,250    51,250    106,272
 75.5      48      6     288     1,728    10,368    62,208    115,248
 76.5      20      7     140       980     6,860    48,020     81,920
 77.5      16      8     128     1,024     8,192    65,536    104,976
 78.5      12      9     108       972     8,748    78,732    120,000
 79.5       3     10      30       300     3,000    30,000     43,923
 80.5       1     11      11       121     1,331    14,641     20,736
 81.5       2     12      24       288     3,456    41,472     57,122
 82.5       1     13      13       169     2,197    28,561     38,416

 Sums        2,000           886    19,802    35,710   661,058    928,254
 1/N (Sums)      1          .443     9.901    17.855   330.529
                          $\bar{u}$   $\nu_2$   $\nu_3$   $\nu_4$

Charlier's check:

$\sum (u + 1)^4 f = \sum u^4 f + 4 \sum u^3 f + 6 \sum u^2 f + 4 \sum u f + \sum f$

928,254 = 661,058 + 4(35,710) + 6(19,802) + 4(886) + 2,000 = 928,254






Computations:

$\bar{x} = c\bar{u} + x_0 = (1)(.443) + 69.5 = 69.943$ in.

$\bar{u}^2 = .196249$,  $\bar{u}^3 = .086938$,  $\bar{u}^4 = .038514$

$\mu_2 = \nu_2 - \bar{u}^2$
$= 9.901 - .196249 = 9.704751$

$\sigma_u = \sqrt{9.704751} = 3.115$

$\sigma_x = c\sigma_u = (1)(3.115) = 3.115$ in.

$\mu_3 = \nu_3 - 3\nu_2\bar{u} + 2\bar{u}^3$
$= 17.855 - 3(9.901)(.443) + 2(.086938)$
$= 17.855 - 13.158429 + .173876$
$= 4.870447$

$\mu_4 = \nu_4 - 4\nu_3\bar{u} + 6\nu_2\bar{u}^2 - 3\bar{u}^4$
$= 330.529 - 4(17.855)(.443) + 6(9.901)(.196249) - 3(.038514)$
$= 330.529 - 31.639060 + 11.658368 - .115542$
$= 310.432766$

$\sigma_u^3 = (3.115)(9.704751) = 30.230299$

$\sigma_u^4 = (9.704751)^2 = 94.182192$

$\alpha_3 = \dfrac{4.870447}{30.230299} = .161$

$\alpha_4 = \dfrac{310.432766}{94.182192} = 3.296$

Summary:

$\bar{x} = 69.943$ in.;   $\alpha_3 = 0.161$;

$\sigma_x = 3.115$ in.;   $\alpha_4 = 3.296$.
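The worksheet of Table 20 can be reproduced mechanically. The following Python sketch recomputes the power sums, Charlier's check, and the four descriptive constants from the frequencies of Table 20; the variable and function names are illustrative assumptions. Because nothing is rounded at intermediate steps, later decimals may differ slightly from the hand computation.

```python
from math import sqrt

# Table 20: provisional origin x0 = 69.5, class interval c = 1.
x0, c = 69.5, 1.0
u = list(range(-11, 14))                      # u = -11, ..., 13
f = [1, 2, 1, 6, 7, 22, 55, 111, 146, 182, 229, 265,
     263, 217, 176, 132, 82, 48, 20, 16, 12, 3, 1, 2, 1]
N = sum(f)

def power_sum(r, shift=0):
    """Sum of f * (u + shift)^r over all classes."""
    return sum(fi * (ui + shift) ** r for ui, fi in zip(u, f))

sums = {r: power_sum(r) for r in (1, 2, 3, 4)}

# Charlier's check: expand (u + 1)^4 and compare the two totals.
charlier_ok = power_sum(4, shift=1) == (
    sums[4] + 4 * sums[3] + 6 * sums[2] + 4 * sums[1] + N)

# Moments about the provisional origin, then about the mean.
nu = {r: sums[r] / N for r in sums}
u_bar = nu[1]
mu2 = nu[2] - u_bar ** 2
mu3 = nu[3] - 3 * nu[2] * u_bar + 2 * u_bar ** 3
mu4 = nu[4] - 4 * nu[3] * u_bar + 6 * nu[2] * u_bar ** 2 - 3 * u_bar ** 4

sigma_u = sqrt(mu2)
x_bar = c * u_bar + x0
sigma_x = c * sigma_u
alpha3 = mu3 / sigma_u ** 3
alpha4 = mu4 / sigma_u ** 4

print(charlier_ok, x_bar, sigma_x, alpha3, alpha4)
```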


10. Sheppard's Corrections. The moments of a frequency
distribution are computed on the assumption that each variate value in
a class interval has the value of the class mark for that interval. This
has the effect of replacing the actual data by somewhat fictitious data
assigned arbitrarily at the central values of the intervals. Evidently
a very coarse grouping might be misleading and it can be shown
mathematically that the above assumption introduces a systematic error,
called a grouping error, in the results obtained for the second and
fourth moments about the mean but does not affect $\mu_1$ and $\mu_3$. To
eliminate this systematic tendency certain corrections are applied to
$\mu_2$ and $\mu_4$.

The derivation of these corrections is beyond the scope of an
elementary course, but it may be worth while to see why it is that
corrections are necessary for some moments and not for others. A
diagram will aid our thinking. Suppose a smooth curve represents the
true frequency distribution while the histogram represents the
distribution with class marks as the variates. Since the moments are
computed from the distribution represented by the histogram, we
scarcely expect our results to be exactly the values of the moments
of the true distribution, which are, of course, what we seek. In using
the distribution represented by the histogram, we are neglecting, for
each rectangle, the little area under the curve shaded A and
substituting for it the little area shaded B. In general B is a little larger
than A, as is illustrated by Figure 12. The excess of B over A for
those rectangles to the left of $\bar{x}$ will be negative, the corresponding
excess for those rectangles to the right of $\bar{x}$ will be positive. This
may be readily understood by considering these little areas as
approximate triangles whose bases are negative or positive according
as they are to the left or right of $\bar{x}$. These excesses for all the
rectangles, both positive and negative, are involved in taking the
summation $\sum f_i (x_i - \bar{x})^r$ for the moments. When $r$ is an odd number,
as 1 or 3, the excesses show up with their algebraic signs and
therefore, over the range of the distribution, the positive excesses just
about offset the negative ones. But in the case of the even moments,
all the excesses now become positive so that the errors accumulate
and the final results for these moments are too large.

To reduce these errors due to grouping, W. F. Sheppard has
demonstrated 1 that the following corrections should be applied. It should
be noticed that as we state them here they should be applied only 
where the class interval is unity, i.e., in the u unit. 

1 Students familiar with more advanced mathematics will find an interesting 
discussion of systematic errors and references to papers dealing with Sheppard’s 
corrections in an article by H. C. Carver, Annals of Mathematical Statistics , 
vol. 7, p. 154. 





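Corrected $\mu_{2:u}$ = uncorrected $\mu_{2:u} - \dfrac{1}{12}$

Corrected $\mu_{3:u}$ = uncorrected $\mu_{3:u}$

Corrected $\mu_{4:u}$ = uncorrected $\mu_{4:u} - \dfrac{1}{2}$ (uncorrected $\mu_{2:u}$) $+ \dfrac{7}{240}$

$\left( \dfrac{1}{12} = 0.08333, \quad \dfrac{7}{240} = 0.02917 \right).$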

Example. For Table 18 we have

Corrected $\mu_{2:u} = 1.720 - .083 = 1.637$
$\sigma_u = \sqrt{1.637} = 1.28$
Corrected $\sigma_x = 10(1.28) = 12.8\%$

Corrected $\mu_{4:u} = 9.4096 - (1.72)/2 + 7/240$
$= 8.5788$

$\alpha_4 = 8.5788/(1.637)^2 = 3.20.$

The values of $\bar{x}$ and $\mu_3$ remain unchanged.
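The corrections are simple enough to wrap in a small helper. The sketch below is in Python; the function name `sheppard_corrected` is an illustrative assumption, and the inputs are the uncorrected moments of Table 18 used in the example above (class interval 10). It uses the exact fractions 1/12 and 7/240 rather than their rounded decimals, so the last digit can differ from the hand computation.

```python
from math import sqrt

def sheppard_corrected(mu2_u, mu4_u):
    """Apply Sheppard's corrections to moments expressed in the u unit
    (class interval taken as unity)."""
    mu2_c = mu2_u - 1 / 12
    mu4_c = mu4_u - mu2_u / 2 + 7 / 240
    return mu2_c, mu4_c

# Uncorrected moments of Table 18 in the u unit.
mu2_c, mu4_c = sheppard_corrected(1.72, 9.4096)

sigma_x = 10 * sqrt(mu2_c)      # back to the x unit, c = 10
alpha4 = mu4_c / mu2_c ** 2

print(round(mu2_c, 3), round(sigma_x, 1), round(mu4_c, 4), round(alpha4, 2))
```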

Sheppard's corrections are valid only for the bell-shaped types of 
distributions. They are not applicable to the J-shaped or U-shaped 
types. Moreover, they constitute a refinement which may not
always be consistent with the degree of accuracy in the original data.
The errors of grouping (not mistakes) are usually small compared 
with the errors existing in the raw data. So, it seems that little 
would be gained by their use in a first course. We will occasionally 
use them in an illustration. 




CHAPTER V 

MEASURES OF DISPERSION 

1. Introduction. The concept of variability is fundamental today
not only in the social sciences but also in the so-called exact physical
sciences. Modern scientific method recognizes the existence of
physical, moral, and mental inequalities. The principle of
variability has come to be accepted as the natural order in social, economic,
and physical phenomena. This principle is the very essence of the
statistical nature of mass phenomena. In this connection, R. A.
Fisher says: 1

The conception of statistics as the study of variation is the natural outcome of 
viewing the subject as the study of populations; for a population of individuals 
in all respects identical is completely described by a description of any one indi¬ 
vidual, together with the number in the group. The populations which are the 
object of statistical study always display variation in one or more respects. 
To speak of statistics as the study of variation also serves to emphasize the 
contrast between the aims of modern statisticians and those of their predecessors. 
For, until comparatively recent times, the vast majority of workers in this field 
appear to have had no other aim than to ascertain aggregate, or average, values. 
The variation itself was not an object of study, but was recognized rather as a 
troublesome circumstance which detracted from the value of the average. . . . Yet, 
from the modern point of view, the study of the causes of variation of any
variable phenomena, from the yield of wheat to the intellect of man, should be
begun by the examination of the variation which presents itself. The study 
of variation leads immediately to the concept of a frequency distribution. 

Therefore in studying a distribution it is important to describe
how the variates are clustered or scattered around an average.
Figure 13 shows how two distributions may even have the same
mean and total frequency, yet differ considerably in variation from
the mean. Such variation is commonly called dispersion,
variability, or spread.

To measure the dispersion of the variates is an important
statistical problem. We will consider three measures: Quartile Deviation,
Mean Deviation, and Standard Deviation, of which the last is by far
the most important.

1 R. A. Fisher, Statistical Methods for Research Workers, p. 3. Oliver and Boyd,
London.



Fig. 13. Two Distributions Differing in Dispersion 


2. The Quartile Deviation. Just as the median selects one point 
of division, we may now take two additional points such that they, 
together with the median, divide the whole distribution into four 
equal parts. These points are called the quartile values. 

The first quartile, denoted by $Q_1$, is that value of $x$ for which
cum $f = N/4$. That is, one-fourth of all the variates in the
distribution are smaller in value than $Q_1$ and three-fourths of them are larger
than $Q_1$. The second quartile $Q_2$ is that value of $x$ for which cum $f$
is $N/2$ and is therefore the median. The third quartile, denoted
by $Q_3$, is that value of $x$ for which cum $f = 3N/4$. Hence fifty per
cent of the total frequency is included between $Q_1$ and $Q_3$.

Half of the distance between $Q_3$ and $Q_1$ is called the
semi-interquartile range or quartile deviation and will be denoted
by $Q$. Thus,

(1)    $Q = \dfrac{Q_3 - Q_1}{2}.$

It should be noted that the median does not necessarily come at the
mid-point of $2Q$, i.e., that a distance $Q$ laid off on either side of
$Q_2$ would not necessarily reach to $Q_1$ and $Q_3$. (See Figure 14.) (For
a symmetrical distribution, to be considered later, this would
be true.)



As a measure of dispersion, $Q$ gives a fairly good idea of the spread
of the variates, and is suitable as such a measure in those cases where
the median would be used as an average. The quartile values $Q_1$
and $Q_3$ are found, like the median, by interpolation in the cumulative
frequency table.


Example. (a) Find the median and the quartile deviation for the distribution
of IQ's in Table 6 (§10, Chapter I). (b) Illustrate the measures found in (a) by
means of a cum $f$ graph.

  End-x      Cum f

   54.5         0
   64.5         3
   74.5        24
   84.5       102
                      <- $Q_1$
   94.5       284
                      <- Med.
  104.5       589
                      <- $Q_3$
  114.5       798
  124.5       879
  134.5       900
  144.5     N = 905

Solution:

$N/4 = 226.25$,   $N/2 = 452.5$,   $3N/4 = 678.75$

$\dfrac{Q_1 - 84.5}{10} = \dfrac{226.25 - 102}{284 - 102}$,    $Q_1 = 91.3$

$\dfrac{Q_2 - 94.5}{10} = \dfrac{452.5 - 284}{589 - 284}$,    $Q_2 = 100.02$

$\dfrac{Q_3 - 104.5}{10} = \dfrac{678.75 - 589}{798 - 589}$,    $Q_3 = 108.8$

$Q = \dfrac{108.8 - 91.3}{2} = 8.75.$


Figure 15 explains graphically the measures obtained by inter¬ 
polation from a cum f table. 
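The interpolation used in this example is the same for every quartile, so it can be written once. The Python sketch below does so for the cumulative table above; the function name `value_at_cum` is an illustrative assumption. Since it carries full precision, $Q$ comes out near 8.73 rather than the 8.75 obtained from the rounded quartiles.

```python
# Cumulative frequency table of the example (class ends and cum f).
ends = [54.5, 64.5, 74.5, 84.5, 94.5, 104.5, 114.5, 124.5, 134.5, 144.5]
cum_f = [0, 3, 24, 102, 284, 589, 798, 879, 900, 905]
N = cum_f[-1]

def value_at_cum(target):
    """Value of x at which cum f reaches `target`, by linear
    interpolation within the class that contains it."""
    for i in range(1, len(ends)):
        if cum_f[i] >= target:
            lo, hi = cum_f[i - 1], cum_f[i]
            width = ends[i] - ends[i - 1]
            return ends[i - 1] + width * (target - lo) / (hi - lo)
    raise ValueError("target exceeds the total frequency")

q1 = value_at_cum(N / 4)
q2 = value_at_cum(N / 2)       # the median
q3 = value_at_cum(3 * N / 4)
quartile_deviation = (q3 - q1) / 2

print(round(q1, 1), round(q2, 2), round(q3, 1), round(quartile_deviation, 2))
```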


Exercises 


1. Criticize the following "definitions":

$Q_1 = \dfrac{N}{4}$,   $Q_2 = \dfrac{N}{2}$,   $Q_3 = \dfrac{3N}{4}$.


2. Find $Q_1$ and $Q_3$ from the cumulative frequency table which you made to
obtain the median for the Glasgow schoolgirl distribution. (Exercise 5
on page 50.)

3. Find the quartile deviation $Q$ from your results in Exercise 2.

4. Find $Q_1$, $Q_2$, $Q_3$ for the distribution in Table 12, and compute $Q$.

5. Compute the value of the semi-interquartile range for other distributions
at the direction of the instructor.



3. Mean Deviation. As a measure of variation about a central
value, it would seem appropriate to take an average of all the
deviations about that central value. In the mean deviation (MD) about
the mean this is precisely what we do, namely, we find the arithmetic
mean of the numerical values of the deviations about the mean.
In summing the deviations, their absolute values are used because
regardless of whether deviations are positive or negative they have
the same influence on the amount of variation. Moreover, if their
algebraic signs be taken account of, we know the sum of such
deviations is zero (Theorem VI of Chapter III). Hence we sum them
treating all deviations as positive.

In mathematical symbols, vertical bars denote absolute values,
so we have

(2)    $MD = \dfrac{1}{N} \sum f_i |x_i - \bar{x}|$

if the $x$ unit is used. When the class interval is the unit, we have

(3)    $MD = \dfrac{1}{N} \sum f_i |u_i - \bar{u}|$

and

(4)    MD ($x$ unit) = $c$ times MD ($u$ unit).










It can be proved that the essentially positive function

$y = \dfrac{1}{N} \sum f_i (x_i - A)^2$

is a minimum when $A = \bar{x}$. (See Theorem II, page 95. Also by
the calculus $dy/dA = 0$ when $A = \bar{x}$.) It was in a similar
investigation to find the value of $B$ for which the function

$\dfrac{1}{N} \sum f_i |x_i - B|$

is a minimum, that the median was discovered. When $B$ is the
median this function is a minimum. (For a proof, see Yule and Kendall,
Theory of Statistics. For an application of this property of the
median in locating centers of industry, see "Elements of Statistics"
by Davis and Nelson, page 85.) However, custom has established
the use of the mean rather than the median in this measure. Hence 
“ mean deviation ” usually refers to the mean deviation from the 
mean. It is also called “ average deviation.” 
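The two minimum properties mentioned above (squared deviations are least about the mean, absolute deviations are least about the median) can be seen numerically by trying many candidate centers. The Python sketch below does this by brute force on a small hypothetical data set; the data and the function names are assumptions made only for illustration, and a grid search is of course no substitute for the proofs cited.

```python
from statistics import mean, median

# Hypothetical ungrouped data (not from the text).
data = [2, 3, 5, 8, 13, 21, 34]

def mean_square_about(a):
    """Mean of squared deviations taken about the point a."""
    return sum((x - a) ** 2 for x in data) / len(data)

def mean_abs_about(b):
    """Mean of absolute deviations taken about the point b."""
    return sum(abs(x - b) for x in data) / len(data)

# Candidate centers 0.0, 0.1, ..., 40.0.
grid = [i / 10 for i in range(0, 401)]
best_a = min(grid, key=mean_square_about)
best_b = min(grid, key=mean_abs_about)

print(best_a, mean(data))     # best_a is the grid point nearest the mean
print(best_b, median(data))   # best_b coincides with the median
```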

Example. Find the mean deviation for the grades in Table 18 where the 
mean value of x is 72.5. 


    x        f     |x - x̄|    f|x - x̄|

  34.5       2        38          76
  44.5       3        28          84
  54.5      11        18         198
  64.5      20         8         160
  74.5      32         2          64
  84.5      25        12         300
  94.5       7        22         154

  Total    100                  1036

$MD = \dfrac{1036}{100} = 10.36.$

What was $\sigma$ for this distribution?


The absolute value of a variable $x'$, denoted by the symbol $|x'|$, is
not very tractable in mathematical operations. Therefore the mean
deviation is not favored by mathematicians since it is unwieldy in
the more theoretical and mathematical discussions. Its chief use
is in experimental work where occasional large and erratic deviations
are liable to occur. In such cases the standard deviation would tend
to emphasize these deviations.

If $m$ of the $N$ variates are greater than the mean, $\bar{x}$, then the mean
deviation may be written

$MD = \dfrac{2}{N} \left| (\text{sum of variates greater than } \bar{x}) - m\bar{x} \right|.$

The student is given a hint in Exercise 34 at the end of Part I on
how to prove a similar formula for $x_i < \bar{x}$.
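Both ways of obtaining the mean deviation can be compared on the grades of Table 18. The Python sketch below computes MD from the definition and again from the shortcut that uses only the variates greater than the mean; the variable names are illustrative assumptions, and the class marks and frequencies are those of the example above.

```python
# Grades of Table 18: class marks and frequencies.
xs = [34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
fs = [2, 3, 11, 20, 32, 25, 7]
N = sum(fs)

x_bar = sum(f * x for x, f in zip(xs, fs)) / N

# Definition (2): mean of the absolute deviations from the mean.
md = sum(f * abs(x - x_bar) for x, f in zip(xs, fs)) / N

# Shortcut: only the m variates greater than the mean are needed.
m = sum(f for x, f in zip(xs, fs) if x > x_bar)
upper_sum = sum(f * x for x, f in zip(xs, fs) if x > x_bar)
md_shortcut = 2 * abs(upper_sum - m * x_bar) / N

print(x_bar, md, md_shortcut)
```

The shortcut works because the positive and negative deviations from the mean have equal totals, so the positive ones alone account for half of the sum of absolute deviations.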

4. The Standard Deviation. To overcome the difficulty of
negative deviations and the use of absolute value signs, the deviations
about the mean may be squared and the mean of these squares taken.
To get back into the original linear units, we take the positive square
root of this result, and have

(5)    $\sigma = \left[ \dfrac{1}{N} \sum f_i (x_i - \bar{x})^2 \right]^{1/2}$

as defined before. The standard deviation measures the same kind 
of phenomenon as the mean deviation and this approach to it is 
frequently satisfactory to a student who otherwise finds it difficult 
to understand. 1 

For a common type of distribution, the standard deviation is
approximately twenty-five per cent greater than the mean deviation.
Speaking more accurately, this is true of a normal distribution (to be
considered in Chapter VI) for which the relation is $MD = \tfrac{4}{5}\sigma$
(approximately).
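For reference, the exact ratio for a normal distribution is $MD/\sigma = \sqrt{2/\pi}$, a standard result that is not derived in this section. The short Python sketch below evaluates it to show how close the rule of four-fifths is; the variable name `ratio` is an illustrative assumption.

```python
from math import sqrt, pi

# Exact ratio MD / sigma for a normal distribution.
ratio = sqrt(2 / pi)

print(round(ratio, 4))       # close to 4/5
print(round(1 / ratio, 4))   # sigma is about 25 per cent greater than MD
```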

It is often convenient to have a name for "the square of the
standard deviation," and for this purpose the term "variance" has
been introduced.

1 The term "standard deviation" was proposed by Pearson and is now used by
almost all English writers. As originally defined by Pearson, this is the square
root of the mean of the squares of deviations taken from the mean of the
distribution, and is not to be used when deviations are measured from any other
reference point. Pearson uses the term "root-mean-square" for a similar
measure when the deviations are taken around any origin other than the mean.
Walker, History of Statistical Method, p. 54.

Although definition (5) is the basic concept which the student
should have for the standard deviation, nevertheless in actual practice
it is seldom desirable to compute $\sigma$ directly from that definition. For
a frequency distribution the method is shown in the chapter on
moments. However we will give another illustration here.

Example . Find the mean and the standard deviation of Table 9, using 
Charlier’s check and Sheppard’s correction. 

Solution: 


Table 21. Weights of Glasgow School Children

  Weight (x)      f      u      fu     fu^2    f(u+1)^2

  29.5 lbs.       1     -5      -5       25        16
  33.5           14     -4     -56      224       126
  37.5           56     -3    -168      504       224
  41.5          172     -2    -344      688       172
  45.5          245     -1    -245      245         0
  49.5          263      0       0        0       263
  53.5          156      1     156      156       624
  57.5           67      2     134      268       603
  61.5           23      3      69      207       368
  65.5            3      4      12       48        75

  Sums         1000           -447     2365      2471
  1/N (Sums)      1          -.447    2.365
                           $\bar{u}$   $\nu_2$

Charlier's check: $\sum f(u + 1)^2 = \sum fu^2 + 2 \sum fu + N$

2471 = 2365 + 2(-447) + 1000 = 2471

Computations:

$\bar{x} = 49.5 + 4(-.447) = 47.712$ lbs.
$\mu_{2:u} = \nu_2 - \bar{u}^2 = 2.165$

Using Sheppard's corrections,

Corrected $\mu_{2:u} = 2.165 - .083 = 2.082$
$\mu_{2:x} = 16(2.082) = 33.312$
$\sigma_x = \sqrt{33.312} = 5.772$ lbs.
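The same worksheet can be followed step by step in code. The Python sketch below reproduces Table 21, Charlier's check, and Sheppard's correction for the second moment; the variable names are illustrative assumptions. It uses the exact correction 1/12 rather than .083, so $\sigma_x$ agrees with the hand result only to about the third decimal.

```python
from math import sqrt

# Table 21: provisional origin 49.5 lbs., class interval 4 lbs.
x0, c = 49.5, 4
u = list(range(-5, 5))                          # u = -5, ..., 4
f = [1, 14, 56, 172, 245, 263, 156, 67, 23, 3]
N = sum(f)

sum_fu = sum(fi * ui for ui, fi in zip(u, f))
sum_fu2 = sum(fi * ui ** 2 for ui, fi in zip(u, f))
sum_check = sum(fi * (ui + 1) ** 2 for ui, fi in zip(u, f))

# Charlier's check for the second power sum.
charlier_ok = sum_check == sum_fu2 + 2 * sum_fu + N

u_bar = sum_fu / N
mu2_u = sum_fu2 / N - u_bar ** 2

# Sheppard's correction is applied in the u unit, then converted.
mu2_u_corrected = mu2_u - 1 / 12
x_bar = x0 + c * u_bar
sigma_x = c * sqrt(mu2_u_corrected)

print(charlier_ok, round(x_bar, 3), round(sigma_x, 3))
```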








It will be proved later that for a certain ideal type of distribution
which is often approximated in practical statistics the range $\bar{x} \pm \sigma_x$
includes about two thirds of the variates. Assuming the above
distribution is of this type we could say that about two thirds of the
children weighed between 42 pounds and 53.5 pounds. Such a
statement assists one in comprehending certain characteristics of the data
though the distribution actually may not be before him.

It is understood that the method of computation described above
is to be used when the class marks are equispaced. If the class
intervals are unequal we must choose $c = 1$ unless the $x$'s denoting
the class marks have a common factor $c$. When $c = 1$, $u$ becomes
$u = x - x_0$, and the work may be simplified a little by an appropriate
choice of $x_0$.


Exercises 


1. (Pearson). The following data represent the percentage of ash-content in
280 wagon tests of a certain kind of coal. Find the mean and the standard
deviation of the distribution:

  Percentage
  Ash-Content     Frequency

   3.0- 3.9            1
   4.0- 4.9            7
   5.0- 5.9           28
   6.0- 6.9           78
   7.0- 7.9           84
   8.0- 8.9           45
   9.0- 9.9           28
  10.0-10.9            7
  11.0-11.9            2

Ans. $\bar{x} = 7.35\%$, $\sigma_x = 1.36\%$.

2. (Camp). Find the mean wage and the standard deviation of the following
data:

  Class            Frequency

  $4.50- 5.99          43
   6.00- 7.49          99
   7.50- 8.99         152
   9.00-10.49         178
  10.50-11.99         160
  12.00-13.49          40
  13.50-14.99          25
  15.00-16.49           3

Ans. $N = 700$, $\bar{x} = \$9.42$, $\sigma_x = \$2.19$.




3. Given $\sigma_x = 2.19$ for the following $(x, f)$ distribution, find $\sigma_v$ and $\sigma_u$ for the
$(v, f)$ and $(u, f)$ distributions, respectively.


/ 

43 

99 

152 

178 

160 

40 

25 

3 

X 

0 

1.5 

3.0 

4.5 

6.0 

7.5 

9.0 

10.5 

V 

1 



3 

4 

5 

6 

7 

u 

■ 



0 

1 

2 

3 

4 


What relation and theorem in Chapter IV does this illustrate? 


4. Find the variance $\sigma_x^2$ of Table 16 (§8, Chapter III).

5. Compute the value of the ratio $MD/\sigma$ for the data in Exercise 1 above.

6. Find the mean and standard deviation for the data in Table 10. 

7. Find the mean and standard deviation for the data in Table 11. 

5. Relative Dispersions. The full significance of different values
of $\sigma$ can only be obtained by experience, but it is obvious that a small
standard deviation indicates that the variates are closely clustered
about the mean; whereas a large standard deviation indicates that
these values are spread out widely from the mean. (See Figure 13.)

The size of variates usually influences not only the mean but also
deviations from the mean. In other words, the magnitudes of the
deviations from the mean seem to be dependent, in some degree, upon
the magnitude of the mean. In comparing dispersion in
distributions, we may correct for differences in the average magnitudes of
the variates by taking the ratio of the standard deviation to the
mean. Thus, the quantity

$\dfrac{\sigma}{\bar{x}}$

is known as the coefficient of variation. It is obviously an abstract
number, being independent of the units of measurement, and it is
usually expressed as a percentage.
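As an illustration of how the coefficient of variation permits comparison across different units, the Python sketch below applies it to two distributions already worked out in this text: the weights of Table 21 (in pounds) and the spans of Table 20 (in inches). The function name `coefficient_of_variation` is an illustrative assumption.

```python
def coefficient_of_variation(mean, sd):
    """Ratio of the standard deviation to the mean, as a percentage."""
    return 100 * sd / mean

# Means and standard deviations found earlier in the text.
cv_weight = coefficient_of_variation(47.712, 5.772)   # Table 21, lbs.
cv_span = coefficient_of_variation(69.943, 3.115)     # Table 20, in.

print(round(cv_weight, 2), round(cv_span, 2))
```

On this measure the weights are relatively more variable than the spans, although the two standard deviations themselves, being in pounds and inches, cannot be compared directly.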

6. Scaling a Distribution in Terms of $\sigma$. Suppose we lay off
intervals of length $\sigma$ on either side of the mean (Figure 16). Then
for a certain type of distribution known as the normal curve (which
will be considered in the next chapter) the following properties can
be proved:

(1) The percentage of the total frequency lying outside the range
$\bar{x} \pm \sigma$ is 32% approximately.

(2) The percentage outside $\bar{x} \pm 2\sigma$ is 5% approximately.

(3) The range $\bar{x} \pm 3\sigma$ includes practically the whole distribution;
the total range is $6\sigma$ approximately.

The student will recognize that these ranges are, in standard units,
$t = \pm 1$, $t = \pm 2$, $t = \pm 3$, respectively. These results follow from
the relation

$t = \dfrac{x - \bar{x}}{\sigma}, \qquad x = \bar{x} + t\sigma.$

Sometimes it is important in a statistical analysis to know how
nearly the given variates are distributed in accordance with the
above property of the normal curve. The distribution of Table 21
has been scaled off in this manner, with the results shown in
Table 22. Figure 17 will be helpful in verifying them.

We will verify here the 34.8% given in Table 22, and the student
is asked to verify the others in Exercise 2. The range $\bar{x} \pm \sigma$ (Figure
17) evidently includes all the variates represented by the two central
rectangles and proportionate parts of the two adjoining rectangles.
From 39.50 to 41.94 is 2.44, and since the variates are assumed to be
uniformly distributed over the class interval we have 172(2.44/4) =
104.92 for the proportionate number to be excluded in the class
39.5 to 43.5. Hence the number below $\bar{x} - \sigma$ is (1 + 14 + 56 +
104.92) = 175.92. Similarly, from 53.484 to 55.5 is 2.016, and we
have 156(2.016/4) = 78.624 as the proportionate number excluded
in the class 51.5 to 55.5. Hence the total above $\bar{x} + \sigma$ is (78.624 +
67 + 23 + 3) = 171.624. So the total number outside $\bar{x} \pm \sigma$ is
(171.624 + 175.92) = 347.544, or about 348, which is 34.8% of the
1000 variates.

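Table 22. Results of Scaling Off Table 21

$\bar{x} = 47.712$,   $\sigma = 5.772$

$\bar{x} - \sigma = 41.940$        $\bar{x} + \sigma = 53.484$
$\bar{x} - 2\sigma = 36.168$       $\bar{x} + 2\sigma = 59.256$
$\bar{x} - 3\sigma = 30.396$       $\bar{x} + 3\sigma = 65.028$

                         Frequency outside the given range
  Range                  Number        Percentage

  $\bar{x} \pm \sigma$         348            34.8
  $\bar{x} \pm 2\sigma$         60             6.0
  $\bar{x} \pm 3\sigma$          3             0.3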















This result could also be obtained as follows: By forming a cum f
table and interpolating in the end-x column we find

cum f at x = 53.484: 828

cum f at x = 41.940: 176

Number in the $(\bar{x} \pm \sigma_x)$ interval: 652

Number outside this interval: 348



Fig. 17 — Distribution of Table 21 Scaled Off in Units of $\sigma$


7. Semi-interquartile Range in Terms of $\sigma$. The range $(Q_3 - Q_1)/2$
when expressed in units of $\sigma$ has a significance in a normal distribution,
as will be shown later. We will denote this by $s$, hence

$$s = \frac{Q_3 - Q_1}{2\sigma} \qquad \text{and} \qquad s = \frac{Q}{\sigma}.$$

For the present we merely calculate its value in the exercises below. 
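As a point of reference for the exercises, the value of $s$ for an exactly normal distribution can be computed directly. The sketch below assumes a standard normal curve and Python 3.8 or later (for `statistics.NormalDist`); the variable names are illustrative only.

```python
from statistics import NormalDist

# For a standard normal curve sigma = 1 and the quartiles are symmetric
# about the mean, so s = (Q3 - Q1) / (2 * sigma) equals the upper quartile.
std_normal = NormalDist(mu=0.0, sigma=1.0)
q1 = std_normal.inv_cdf(0.25)
q3 = std_normal.inv_cdf(0.75)
s = (q3 - q1) / 2.0

print(round(s, 4))
```

Observed values of $s$ near 0.67, such as those listed in Table 23, are therefore consistent with an approximately normal form.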


Exercises 

1. Find the mean and standard deviation for the distribution of Lengths of 

Telephone Calls, given in Table 8 (Chapter I). Use Charlier’s check. 

2. In the three distributions named, show that the percentages outside $\bar{x} + t\sigma$ for

$t = \pm 1$, $\pm 2$, and $\pm 3$, are as stated in Table 23. Verify also the values of $s$.


Table 23

                             Percentage Outside
Distribution       N       $\bar{x} \pm \sigma$   $\bar{x} \pm 2\sigma$   $\bar{x} \pm 3\sigma$   $s$
Glasgow girls      1000    34.8                   6.0                                             0.675
Telephone Calls    995     32.7                   5.0                                             0.69
Span               2000    31.8                   4.2                                             0.665













8. N Small. Ungrouped Data. When N is small it is seldom desirable
to attempt an arrangement of the variates into a frequency
distribution. Moreover, in this case, the values of $\alpha_3$ and $\alpha_4$ are not
usually needed because the applications of these measures relate to
characteristics of large distributions. Therefore, only the mean and
standard deviation are usually required for a small set of ungrouped
data. The following methods will help the student become familiar
with the several formulas for $\sigma$ which may be used in this case.

Table 24 — Average Yields of Corn in Bushels per Acre
for a Certain Section in Illinois from 1901-1920

Year      Yield (x)     u       u^2
1901      21           -15      225
1902      39             3        9
1903      32            -4       16
1904      37             1        1
1905      40             4       16
1906      36             0        0
1907      36             0        0
1908      32            -4       16
1909      36             0        0
1910      39             3        9
1911      33            -3        9
1912      40             4       16
1913      27            -9       81
1914      29            -7       49
1915      36             0        0
1916      30            -6       36
1917      38             2        4
1918      36             0        0
1919      36             0        0
1920      35            -1        1

Totals    N = 20       -32      488


Method I. The indirect method involving the u unit may still be
used for finding the first and second moments. Since each variate
is being treated separately $f = 1$, and we compute the values of

$$\nu_r = \frac{1}{N}\sum u^r \qquad \text{for } r = 1 \text{ and } 2.$$

If the values of x are unequally spaced
we take $c = 1$ and let $u = x - x_0$, which changes the origin but not
the units. In other words, the procedure is the same as for a frequency
distribution except that $f = 1$ and $c = 1$.

Example. Find the mean and standard deviation for Table 24. N = 20.
We choose $x_0 = 36$.

Computations:

$$\nu_1 = \bar{u} = -\frac{32}{20} = -1.6; \qquad \bar{x} = x_0 + \bar{u} = 36 - 1.6 = 34.4 \text{ bushels.}$$

$$\nu_2 = \frac{488}{20} = 24.40; \qquad \mu_2 = \nu_2 - \bar{u}^2 = 24.40 - 2.56 = 21.84.$$

Therefore,

$$\sigma_x = \sigma_u = \sqrt{21.84} = 4.67 \text{ bushels.}$$
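The computation of Method I can be sketched as follows. This is a minimal illustration using the yields of Table 24 and the working origin $x_0 = 36$ chosen in the example; the variable names are not from the text.

```python
import math

# Yields of Table 24, 1901 through 1920.
yields = [21, 39, 32, 37, 40, 36, 36, 32, 36, 39,
          33, 40, 27, 29, 36, 30, 38, 36, 36, 35]

x0 = 36                               # arbitrary working origin
u = [x - x0 for x in yields]          # deviations from the working origin
n = len(yields)

nu1 = sum(u) / n                      # first moment about x0 (this is u-bar)
nu2 = sum(v * v for v in u) / n       # second moment about x0

mean = x0 + nu1                       # shift the origin back
mu2 = nu2 - nu1 ** 2                  # second moment about the mean
sigma = math.sqrt(mu2)

print(sum(u), sum(v * v for v in u))  # column totals: -32 and 488
print(round(mean, 2), round(mu2, 2), round(sigma, 2))
```

Any other choice of `x0` gives the same mean and standard deviation; a value near the middle of the data merely keeps the `u` column small.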


Table 25

x        x' = x - x̄      x'^2
21       -13.4           179.56
27        -7.4            54.76
29        -5.4            29.16
30        -4.4            19.36
32        -2.4             5.76
32        -2.4             5.76
33        -1.4             1.96
35         0.6              .36
36         1.6             2.56
36         1.6             2.56
36         1.6             2.56
36         1.6             2.56
36         1.6             2.56
36         1.6             2.56
37         2.6             6.76
38         3.6            12.96
39         4.6            21.16
39         4.6            21.16
40         5.6            31.36
40         5.6            31.36

688      Σ|x'| = 73.6    436.80






Method II. When $f = 1$, formula (5) becomes

$$\sigma_x = \left[\frac{1}{N}\sum (x - \bar{x})^2\right]^{1/2} \tag{7}$$

and sometimes it is best to compute the standard deviation directly
from this definition, without the use of the u unit. Thus the origin
is placed at the mean and all indirect methods are abandoned. If
the mean deviation is also desired, clearly this method should be
used. It is exemplified in Table 25 for the preceding example, and
the variates have been arranged in order of magnitude.


$$\bar{x} = \frac{688}{20} = 34.4 \text{ bushels}$$

$$\sigma_x^2 = \frac{436.80}{20} = 21.84$$

$$\sigma_x = 4.67 \text{ bushels}$$

$$MD = \frac{1}{N}\sum |x'| = \frac{73.6}{20} = 3.68 \text{ bushels.}$$
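Method II yields the mean deviation as a by-product of the $x'$ column, which the following sketch illustrates with the same twenty yields, arranged in order of magnitude as in Table 25. The names are illustrative only.

```python
import math

yields = [21, 27, 29, 30, 32, 32, 33, 35, 36, 36,
          36, 36, 36, 36, 37, 38, 39, 39, 40, 40]

n = len(yields)
mean = sum(yields) / n

# Deviations from the mean: the x' column of Table 25.
dev = [x - mean for x in yields]

variance = sum(d * d for d in dev) / n
sigma = math.sqrt(variance)
mean_deviation = sum(abs(d) for d in dev) / n

print(round(mean, 2), round(variance, 2), round(sigma, 2), round(mean_deviation, 2))
```

The algebraic sum of `dev` is zero up to rounding, which is the check requested in Miscellaneous Exercise 1(a).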


Table 26

x        x^2
21        441
27        729
29        841
30        900
32       1024
32       1024
33       1089
35       1225
36       1296
36       1296
36       1296
36       1296
36       1296
36       1296
37       1369
38       1444
39       1521
39       1521
40       1600
40       1600

688      24,104






Method III. From the relation

$$\mu_2 = \nu_2 - (\nu_1)^2$$

we have

$$\mu_2 = \frac{1}{N}\sum x^2 - \bar{x}^2$$

when $f = 1$. Therefore $\sigma$ may be written

$$\sigma_x = \left[\frac{1}{N}\sum x^2 - \bar{x}^2\right]^{1/2}. \tag{8}$$

This method is perhaps the best when the values of x are not large or
when a table of squares is available. It is illustrated below for the
preceding example. (See Table 26.)


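Computations:

$$\bar{x} = \frac{1}{N}\sum x = \frac{688}{20} = 34.4 \text{ bushels}$$

$$\bar{x}^2 = (34.4)^2 = 1183.36$$

$$\frac{1}{N}\sum x^2 = \frac{24{,}104}{20} = 1205.20$$

$$\sigma_x = [1205.20 - 1183.36]^{1/2} = (21.84)^{1/2} = 4.67 \text{ bushels.}$$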
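Formula (8) needs only the totals $N$, $\sum x$, and $\sum x^2$, so it can be applied when the individual variates are no longer at hand. A minimal sketch, with a hypothetical function name and the totals of Table 26:

```python
import math

def mean_and_sigma_from_totals(n, sum_x, sum_x2):
    """Mean and standard deviation from N, sum of x, and sum of x squared,
    using sigma^2 = (1/N) * sum(x^2) - mean^2, i.e. formula (8)."""
    mean = sum_x / n
    variance = sum_x2 / n - mean ** 2
    return mean, math.sqrt(variance)

# Totals of Table 26.
mean, sigma = mean_and_sigma_from_totals(20, 688, 24104)
print(round(mean, 2), round(sigma, 2))
```

A caution that goes beyond the text: when the variates are large relative to their spread, the subtraction in (8) loses significant figures in machine arithmetic, and the deviations of Method I or Method II are then the safer route.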


Miscellaneous Exercises 

1. (a) Verify that the algebraic sum of the numbers in the x' column of Table 25 

is zero. 

( b ) Verify the value of mean deviation given for Table 25. 

2. Using your own judgment as to the most appropriate method, find the mean

and standard deviation for each of the two sets of data, $x_1$ and $x_2$:



                                                                      Answers
$x_1$   88  95  68  73  75  88  57  68  62  79  73  74  78           $\bar{x}_1 = 69.80$
        80  57  65  69  74  78  72  59  47  56  67  43               $\sigma_1 = 12.13$

$x_2$   82  86  75  78  72  79  63  65  67  75  68  70  79           $\bar{x}_2 = 67.64$
        78  51  58  65  69  68  83  80  42  43  48  47               $\sigma_2 = 12.68$




3. Complete the computations and find the mean and variance of the following 
distribution: 



Ans. $\bar{y} = 87.31$, $\sigma_y^2 = 56.66$

4. Data have been gathered showing the points scored on a mental test by
290 prospective employees and the per cent of standard production
attained by these same 290 persons after being employed. 1 The following
statistics were obtained:

Mental test: mean = 43.33 pts., $\sigma$ = 9.25 pts.

Productive ability: mean = 92.02%, $\sigma$ = 24.47%

(a) Compare the relative dispersion in mental test and productive ability.

(b) What factors, other than mental level, may have affected dispersion
under factory conditions?

5. Read and abstract the article Variability, Journal of Educational Research,
vol. 4, no. 3, pp. 221-26.

6. Find the median for Table 26.

7. Find $\bar{x}$, $\sigma_x$, MD, and Q for the following distribution.


mid-x    2    4    6    8    10
f        1    4    6    4     1


8. Show that (8) may be written as follows:

$$\sigma_x = \frac{\left[N\sum x^2 - \left(\sum x\right)^2\right]^{1/2}}{N}$$

1 Wembridge, Experiment and Statistics in the Selection of Employees , Journal 
of the American Statistical Association, March 1923, p. 605. 









9. If the variates are all equal, say each $x_i = k$, show that $\bar{x} = k$ and $\sigma = 0$.

10. For a set of ungrouped data it is found that $N = 15$, $\sum x = 480$, $\sum x^2 = 15{,}735$.
Find $\bar{x}$ and $\sigma$.

11. Find the variance of the following data.

5.7 6.2 6.5 6.0 6.3 5.8 5.7 6.0 6.0 5.8

Ans. $\sigma_x^2 = .064$.

12. Prove the identity:

$$(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_N - \bar{x})^2 = (x_1^2 + x_2^2 + \cdots + x_N^2) - N\bar{x}^2.$$

9. The Standard Deviation of the Combination of Sets. The
following theorems involving $\sigma$ are interesting in themselves and
have useful applications.

The relation $\mu_2 = \nu_2 - \nu_1^2$ is true in a more general sense than we
have previously used. Its generalized meaning will be exposed in
our first theorem.

Theorem I. The second moment about the mean equals the second
moment about an arbitrary point $P(x_0, 0)$ minus the square of the distance
between the mean and P.

Stated in symbols the theorem may be clearer. Suppose we have
a set of N variates whose mean is $\bar{x}$. Graphically, $\bar{x}$ is a point on the
x-axis. Then if P is any other point on the x-axis, according to
Theorem I we have

$$\frac{1}{N}\sum (x - \bar{x})^2 = \frac{1}{N}\sum (x - P)^2 - (\bar{x} - P)^2. \tag{9}$$

To prove this relation we may write

$$(x - \bar{x}) = (x - P) - (\bar{x} - P).$$

Then

$$\frac{1}{N}\sum (x - \bar{x})^2 = \frac{1}{N}\sum \left[(x - P) - (\bar{x} - P)\right]^2,$$

the right member of which simplifies into the right member of (9).

The generality of the theorem consists in extending the original
definitions of $\nu_2$ and $\nu_1$ so that they refer to moments about any point
P on the x-axis (except $\bar{x}$), and not merely about zero. Thus now,

$$\nu_2 = \frac{1}{N}\sum (x - P)^2.$$

If we take $P = 0$ we have the original definition of $\nu_2$. Also, when P
moves back to zero, we see that $\nu_1$ becomes $\bar{x}$.

In other words the original definitions of the $\nu$'s are merely the more
general definitions when zero is the value chosen for the arbitrary
point. (See (1a) of Chapter IV.)

Theorem II. The sum of the squares of deviations of the variates
from their mean is less than the sum of the squares of the deviations of
the variates from any other value. Therefore $\sigma$ is less than any similar
"root-mean-square."

The proof consists in showing that $\mu_2 < \nu_2$, which is left to the
student as an exercise.

Theorem III. Let there be one set of $n_1$ variates $x_{1i}$ $(i = 1, 2, \ldots, n_1)$
and another set of $n_2$ variates $x_{2i}$ $(i = 1, 2, \ldots, n_2)$ and let $\bar{x}$ be the
mean of the combined sets (Theorem VIII, Chapter III). The variance
$\sigma^2$ of the set formed by the combination of these two sets is given by
the following formula:

$$N\sigma^2 = \sum_{i=1}^{n_1}(x_{1i} - \bar{x})^2 + \sum_{i=1}^{n_2}(x_{2i} - \bar{x})^2 \tag{10}$$

where $N = n_1 + n_2$.

Proof: The proof consists in showing that

$$\sum_{i=1}^{n_1}(x_{1i} - \bar{x})^2 + \sum_{i=1}^{n_2}(x_{2i} - \bar{x})^2 = \sum_{i=1}^{n_1+n_2}(x_i - \bar{x})^2$$

which is left as an exercise for the student.

The above theorem is not very important in itself but it is useful 
in proving the next theorem which gives the relation between the 
variance of a composite set and the variances of sub-sets. 

Theorem IV. Let the frequency, mean, and standard deviation be
denoted by $n_1$, $\bar{x}_1$, and $\sigma_1$ for one set of variates and by $n_2$, $\bar{x}_2$, $\sigma_2$ for a
second set. The variance $\sigma^2$ of the composite set is given by the following
relation:

$$N\sigma^2 = n_1\sigma_1^2 + n_2\sigma_2^2 + n_1 d_1^2 + n_2 d_2^2$$

where $N = n_1 + n_2$, $d_1 = \bar{x}_1 - \bar{x}$, $d_2 = \bar{x}_2 - \bar{x}$, and $\bar{x}$ is the mean of
the composite set.





Proof: For the $n_1$ set, $\bar{x}$ may be regarded as an arbitrary point P.
Hence by Theorem I we have

$$\frac{1}{n_1}\sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1)^2 = \frac{1}{n_1}\sum_{i=1}^{n_1}(x_{1i} - \bar{x})^2 - (\bar{x}_1 - \bar{x})^2.$$

Multiplying through by $n_1$ this becomes

$$n_1\sigma_1^2 = \sum_{i=1}^{n_1}(x_{1i} - \bar{x})^2 - n_1 d_1^2. \tag{11}$$

Similarly for the $n_2$ group we have

$$n_2\sigma_2^2 = \sum_{i=1}^{n_2}(x_{2i} - \bar{x})^2 - n_2 d_2^2. \tag{12}$$

Adding (11) and (12), and using (10), we obtain

$$n_1\sigma_1^2 + n_2\sigma_2^2 = N\sigma^2 - n_1 d_1^2 - n_2 d_2^2.$$

Hence,

$$N\sigma^2 = n_1\sigma_1^2 + n_2\sigma_2^2 + n_1 d_1^2 + n_2 d_2^2. \tag{13}$$

For k sets combined into a single set we can generalize (13) into
the following relation:

$$N\sigma^2 = \sum_{i=1}^{k} n_i\sigma_i^2 + \sum_{i=1}^{k} n_i d_i^2 \tag{14}$$

where $N = \sum_{i=1}^{k} n_i$ and $d_i = \bar{x}_i - \bar{x}$. It is interesting to observe that
$\frac{1}{N}\sum_{i=1}^{k} n_i d_i^2$ is the variance of the means of the sub-sets. Thus we have
the important relation

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{k} n_i\sigma_i^2 + \sigma_{\bar{x}}^2 \tag{14a}$$

which shows that the total variance may be broken up into two parts,
one of which is the weighted mean of the variances in the sub-sets
and the other is the variance of their means. These two parts are
sometimes called the average variance within classes and the variance
between the means of the classes.
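The decomposition in (14a) can be illustrated numerically. The sketch below uses the two sets $x_1$ and $x_2$ of Miscellaneous Exercise 2 as the sub-sets; the helper names `mean` and `variance` are illustrative and not taken from the text.

```python
import math

x1 = [88, 95, 68, 73, 75, 88, 57, 68, 62, 79, 73, 74, 78,
      80, 57, 65, 69, 74, 78, 72, 59, 47, 56, 67, 43]
x2 = [82, 86, 75, 78, 72, 79, 63, 65, 67, 75, 68, 70, 79,
      78, 51, 58, 65, 69, 68, 83, 80, 42, 43, 48, 47]

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Second moment about the mean, with divisor N as in the text."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

subsets = [x1, x2]
N = sum(len(s) for s in subsets)

# Weighted mean of the sub-set means, formula (18a).
grand_mean = sum(len(s) * mean(s) for s in subsets) / N

# The two parts of (14a).
within = sum(len(s) * variance(s) for s in subsets) / N
between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in subsets) / N

# Direct computation on the pooled variates, for comparison.
total = variance(x1 + x2)

print(round(grand_mean, 2), round(within, 3), round(between, 3), round(total, 3))
print(round(math.sqrt(total), 2))
```

The within and between parts add up to the variance of the pooled set, and the combined mean agrees with the answer 68.72 quoted for the combined set in a later exercise.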

Corollary I. Equation (13) may be written in the following form:

$$N\sigma^2 = n_1(\sigma_1^2 + \bar{x}_1^2) + n_2(\sigma_2^2 + \bar{x}_2^2) - N\bar{x}^2. \tag{15}$$





Proof: Since

$$n_1 d_1^2 = n_1(\bar{x}_1 - \bar{x})^2 = n_1\bar{x}_1^2 - (2n_1\bar{x}_1\bar{x} - n_1\bar{x}^2)$$

and

$$n_2 d_2^2 = n_2(\bar{x}_2 - \bar{x})^2 = n_2\bar{x}_2^2 - (2n_2\bar{x}_2\bar{x} - n_2\bar{x}^2)$$

the proof consists in showing that the sum of the terms in the end
parentheses above reduces to $N\bar{x}^2$. Rearranging these terms their
sum is

$$2\bar{x}(n_1\bar{x}_1 + n_2\bar{x}_2) - \bar{x}^2(n_1 + n_2),$$

which by Theorem VIII (Chapter III) becomes

$$2\bar{x}N\bar{x} - \bar{x}^2 N = N\bar{x}^2.$$

Generalizing for k groups, (15) becomes

$$N\sigma^2 = \sum_{i=1}^{k} n_i(\sigma_i^2 + \bar{x}_i^2) - N\bar{x}^2. \tag{16}$$

Corollary II. Equation (13) may also be written in the form:

$$N\sigma^2 = n_1\sigma_1^2 + n_2\sigma_2^2 + \frac{n_1 n_2}{N}(\bar{x}_1 - \bar{x}_2)^2. \tag{17}$$

The proof consists in showing that

$$n_1 d_1^2 + n_2 d_2^2 = \frac{n_1 n_2}{N}(\bar{x}_1 - \bar{x}_2)^2.$$

This is left as an exercise.

For purposes of computation, (17) may be more convenient than
either (13) or (15) because it does not require $\bar{x}$, but it does not lend
itself to a generalization for k sets. Generalizations may be useful
both for computing and theoretical purposes. Formula (14) is particularly
useful in developing the theory of a later section.

For convenience, the formulas of Theorem VIII, Chapter III, are
repeated here:

$$\bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} \tag{18}$$

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{k} n_i\bar{x}_i, \qquad N = \sum_{i=1}^{k} n_i. \tag{18a}$$






Theorem V. Consider k sets. Suppose the second moment of each
set is taken about the mean, $\bar{x}$, of the combined sets. Let $\nu_2^{(i)}$ represent
this moment for the ith set. Then the variance $\sigma^2$ for the combined sets
is given by

$$N\sigma^2 = n_1\nu_2^{(1)} + n_2\nu_2^{(2)} + \cdots + n_k\nu_2^{(k)} = \sum_{i=1}^{k} n_i\nu_2^{(i)} \tag{19}$$

where $n_i$ represents the frequency in the ith set and $\sum_{i=1}^{k} n_i = N$.

Proof: We may write (10) in the form

$$N\sigma^2 = n_1\nu_2^{(1)} + n_2\nu_2^{(2)}.$$

So generalizing this form of (10) for k sets we obtain (19).

The next theorem gives the standard deviation of the distribution
formed by the first N integers, that is, when $x = 1, 2, 3, \ldots, N$. It is
useful in cases when the variates are recorded not by measurements
but by their respective positions when ranked in order with respect
to some character or property.

Theorem VI. The standard deviation $\sigma_N$ of the first N natural numbers
is given by

$$\sigma_N = \sqrt{\frac{N^2 - 1}{12}}. \tag{20}$$

Proof: By a fundamental definition we have

$$\sigma_N^2 = \frac{1}{N}\sum_{x=1}^{N} x^2 - \left[\frac{1}{N}\sum_{x=1}^{N} x\right]^2,$$

and by Theorems IV and V of Chapter III, this becomes

$$\sigma_N^2 = \tfrac{1}{6}(N + 1)(2N + 1) - \tfrac{1}{4}(N + 1)^2$$

which reduces to

$$\sigma_N^2 = \frac{N^2 - 1}{12}$$

whence we obtain (20).
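Formula (20) is easily checked against a direct computation. The following sketch does so for N = 10, a value chosen here only for illustration; `sigma_first_n` is a hypothetical helper name.

```python
import math
import statistics

def sigma_first_n(n):
    """Standard deviation of the integers 1, 2, ..., n by formula (20)."""
    return math.sqrt((n * n - 1) / 12)

n = 10
ranks = list(range(1, n + 1))

by_formula = sigma_first_n(n)
# pstdev uses the divisor N, matching the definition of sigma in the text.
by_definition = statistics.pstdev(ranks)

print(round(by_formula, 4), round(by_definition, 4))
```

Both routes give the same value, so for ranked data the standard deviation depends on N alone.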

10. Graphical Representation. We have shown that, if certain
statistics are given for two sub-sets,
the corresponding statistics for the composite set may be obtained
by means of (13) and (18a). We have been thinking of these statistics
as relating to distributions in the x-direction. The following
diagrams show how the means and standard deviations of three such
distributions may be represented geometrically by the points whose
ordinates are zero and whose abscissas are, respectively, $\bar{x}_1$, $(\bar{x}_1 \pm \sigma_1)$;
$\bar{x}_2$, $(\bar{x}_2 \pm \sigma_2)$; and $\bar{x}$, $(\bar{x} \pm \sigma_x)$. The points are plotted on three
different axes to avoid confusion, but they are to be thought of as
being referred to the same origin and plotted on the same scale.


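[Diagram: three horizontal axes marked at $\bar{x}_1 - \sigma_1$, $\bar{x}_1$, $\bar{x}_1 + \sigma_1$; at $\bar{x}_2 - \sigma_2$, $\bar{x}_2$, $\bar{x}_2 + \sigma_2$; and at $\bar{x} - \sigma_x$, $\bar{x}$, $\bar{x} + \sigma_x$.]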

It should be clear that Theorems I-IV (§9) will apply to distributions
in the y-direction as well as in the x-direction. In particular,
it is obvious that (13) and (18a) hold if we replace x by y. Then the
graphical representation of the means and standard deviations

                       Subsets                    Composite set
Frequency              $n_1$         $n_2$         $N$
Mean                   $\bar{y}_1$   $\bar{y}_2$   $\bar{y}$
Standard deviation     $\sigma_1$    $\sigma_2$    $\sigma_y$

is shown below.


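[Diagram: three vertical axes marked at $\bar{y}_1 + \sigma_1$, $\bar{y}_1$, $\bar{y}_1 - \sigma_1$; at $\bar{y}_2 + \sigma_2$, $\bar{y}_2$, $\bar{y}_2 - \sigma_2$; and at $\bar{y} + \sigma_y$, $\bar{y}$, $\bar{y} - \sigma_y$.]
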

Table 27 




It will be helpful to discuss one more notion in this connection.
Suppose the y composite set is made up of k sub-sets and the means
$\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_k$ of these sub-sets are plotted on the y-axis as shown
by the labels on the left side of the axis below.

[Diagram: a vertical axis marked at $\bar{y} + \sigma_y$, $\bar{y} + \sigma_{\bar{y}}$, $\bar{y}$, $\bar{y} - \sigma_{\bar{y}}$, and $\bar{y} - \sigma_y$.]

We will denote the standard deviation of these means by $\sigma_{\bar{y}}$.
Then the points $\bar{y}$, $(\bar{y} \pm \sigma_{\bar{y}})$, and $(\bar{y} \pm \sigma_y)$ may be plotted as shown.
We would expect less variability among the means of the sub-sets
than among the y's of the composite set, that is, that $\sigma_{\bar{y}}$ would be
less than $\sigma_y$. It is clear that (14) and (14a) hold when x is replaced
by y.

A grasp of these notions will help in the analysis of Table 27 which
the student is asked to make in problems 5 and 6 below.

Exercises 

1. (a) Show that $\nu_1 = \frac{1}{N}\sum (x_i - P) = (\bar{x} - P)$.

(b) Derive equations (9) and (13). If $n_1 = n_2$, what does (13) reduce to?

2. Given the following information about two sets of data:

$n_1 = 20$, $n_2 = 30$

$\bar{x}_1 = 25$, $\bar{x}_2 = 20$

$\sigma_1^2 = 5$, $\sigma_2^2 = 4$.

Find the mean and variance of the composite set.

3. Think of the two groups in Exercise 2, page 92, as combined into a single
set.

(a) Find the mean of the combined set by formula (18).

(b) Find the standard deviation of the combined set using the result of (a) and
formula (13). Ans. $\bar{x} = 68.72$, $\sigma = 12.45$.












4. Using Theorem VI find the mean and standard deviation of the first 25
natural numbers.

5. Consider Table 27. Observe that the first and last columns form a frequency
distribution and that columns (1) to (8) are subdistributions whose totals
add up to N = 260 which is also the sum of the last column. Let $n_i$
represent the frequency in the ith column and answer the following
questions: $n_1 = ?$, $n_4 = ?$, $n_8 = ?$, $\sum_{i=1}^{8} n_i = ?$ Let $\bar{y}_i$ and $\sigma_i^2$ represent
mean and variance in the ith column. Find the mean and variance of
each of the columns (1) to (8), first in v units where $v = (y - 85)/10$.
Check your answers with those given at the bottom of the table.

6. Using formulas (18a) and (14) find the mean $\bar{y}$ and variance $\sigma_y^2$ of the total
distribution in Table 27 and check your answers with those given at the
bottom of the last column.

Hint. The student will observe that the means, $\bar{y}_i$, of the columns in
Table 27 are the values denoted by y in Exercise 3, page 93. The weighted
mean of these mean values is the mean of the whole table. That is,
from (18a),

$$\bar{y} = \frac{1}{N}\sum_{i=1}^{k} n_i\bar{y}_i = 87.31.$$

The answer 56.66 (Exercise 3) is the variance, $\sigma_{\bar{y}}^2$, of the means of the columns
of Table 27 and is not to be confused with the variance $\sigma_y^2$ of the whole
table. In using (14), $\sigma^2$ is the variance of the whole table, $\sigma_i^2$ is the variance
of the ith column, and the expression $\sum_{i=1}^{k} n_i d_i^2$ equals $N\sigma_{\bar{y}}^2$ where
$\sigma_{\bar{y}}^2$ is the variance of the means of the columns since now $d_i = \bar{y}_i - \bar{y}$.

7. In Theorem V (§9) show that

$$\nu_2^{(i)} = \sigma_i^2 + d_i^2.$$

Hence prove that (19) may be derived from (14) by showing that (14)
may be written as follows:

$$N\sigma^2 = \sum_{i=1}^{k} n_i(\sigma_i^2 + d_i^2).$$

8. (a) Derive the following relation from (18a),

$$\bar{x}_1 = \frac{1}{n_1}\left[N\bar{x} - \sum_{i=2}^{k} n_i\bar{x}_i\right].$$

What does this formula become when $k = 2$?

(b) Derive the following relation from (15),

$$\sigma_1^2 = \frac{1}{n_1}\left[N(\sigma^2 + \bar{x}^2) - n_2(\sigma_2^2 + \bar{x}_2^2)\right] - \bar{x}_1^2.$$




9. In a certain distribution of N = 25 measurements it was found that $\bar{x} = 56$
inches and $\sigma = 2$ inches. After these results were computed it was discovered
that a mistake had been made in one of the measurements which
was recorded as 64 inches. Find the mean and standard deviation if the
incorrect variate, 64, is omitted.

Hint. Let $n_1 = 24$, $n_2 = 1$. Then $\bar{x}_2 = 64$ and $\sigma_2 = 0$.

To find $\bar{x}_1$ and $\sigma_1$ use formulas in Exercise 8 above.

10. If two or more variates are deleted from a distribution for which N, $\bar{x}$, and $\sigma$
are given, show how to compute the mean and variance of the remaining
variates.



CHAPTER VI 


TYPES OF DISTRIBUTIONS. THE NORMAL CURVE 

1. Skewness and Kurtosis. The shapes of frequency distributions 
are not all alike. Unimodal distributions may differ in two ways 
with respect to form. These differences can be described more easily 
if we think in terms of frequency curves. The curve may be quite 




symmetrical, or it may be skew, bulging out on one side more than 
on the other. Secondly, the top of the curve may be narrow and 
peaked, or it may be somewhat flat giving a mound-shape effect. 

The mean and standard deviation are not sufficient to detect these
characteristics, so we need other measures to describe them. Consider,
for example, the three distributions of the weights (in class
units) 1 of different breeds of mice 120-130 days old given in Table 28.
Experiments on mice are important in cancer research. These distributions
are, however, somewhat fictitious, being adapted
from some actual data for purposes of illustration.

Table 28

          A      B      C
u         f      f      f
-3        0      1      0
-2        3      1      1
-1        6      5     10
 0        7     11      6
 1        6      5      5
 2        3      1      2
 3        0      1      1

Sums     25     25     25

The student may easily verify that for each of these distributions
we find the same mean and standard deviation, namely,

$$\bar{u} = 0, \qquad \sigma_u = 1.2.$$

One may see from their histograms that these distributions
are essentially different in shape even though they all have the
same mean and standard deviation. These differences would
be more pronounced if N were so large that the shapes approached
a regular and smooth form. Such a large value is
called the "population" or "universe" and the value of N that
we usually have at hand is a "sample."

Fig. 20

1 Neither the original units nor the class interval need concern us here.








Lack of symmetry in a distribution is known as "skewness."
This characteristic is measured by $\alpha_3$. If a distribution is symmetrical
$\alpha_3 = 0$, but $\alpha_3$ may be positive or negative depending upon whether
the long tail of the distribution extends to the right or the left of the
mean. (See Figure 18.)

Figure 19 exhibits curves with different degrees of flatness or
peakedness. The flatness that we are now describing is in the
neighborhood of the mode and is not to be confused with the flatness
of a curve as a whole which is due to spread or dispersion.
The curves in Figure 19 all have the same spread. So their flatness
depends upon the relative amount of material in the vicinity of the
mode. This characteristic of a curve is called "kurtosis" and is
measured by $\alpha_4$. By the calculus it can be demonstrated that $\alpha_4 = 3$
for a certain type of distribution which is called the normal curve.
A frequency curve for which $\alpha_4 < 3$ is relatively flat-topped and is
said to have kurtosis. If $\alpha_4 > 3$ the curve is relatively narrow and
peaked and is said to lack kurtosis. The value $\alpha_4 - 3$ for a distribution
is sometimes called its "excess." This is zero for a normal
distribution, positive for a distribution lacking kurtosis, and negative
for a distribution having kurtosis. The values of $\alpha_3$ and $\alpha_4$ computed
for an observed distribution are useful in selecting the curve which
will best represent the type to which that distribution belongs. 1

Both $\alpha_3$ and $\alpha_4$ are abstract numbers and therefore skewness and
kurtosis in different distributions may be compared by these measures.
Therefore our definitions are

(1)   $\alpha_3$ is a measure of skewness,
      $\alpha_4$ is a measure of kurtosis.
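The following sketch shows how these measures separate the three distributions of Table 28 even though their means and standard deviations agree. It assumes the usual moment-ratio definitions $\alpha_3 = \mu_3/\sigma^3$ and $\alpha_4 = \mu_4/\sigma^4$, which are not restated in this section, and the helper names are illustrative only.

```python
u_values = [-3, -2, -1, 0, 1, 2, 3]
table_28 = {
    "A": [0, 3, 6, 7, 6, 3, 0],
    "B": [1, 1, 5, 11, 5, 1, 1],
    "C": [0, 1, 10, 6, 5, 2, 1],
}

def moment(freqs, r, about=0.0):
    """r-th moment of a frequency distribution about the given point."""
    n = sum(freqs)
    return sum(f * (u - about) ** r for u, f in zip(u_values, freqs)) / n

def shape(freqs):
    """Return (mean, sigma, alpha3, alpha4), assuming alpha_r = mu_r / sigma^r."""
    mean = moment(freqs, 1)
    sigma = moment(freqs, 2, mean) ** 0.5
    alpha3 = moment(freqs, 3, mean) / sigma ** 3
    alpha4 = moment(freqs, 4, mean) / sigma ** 4
    return mean, sigma, alpha3, alpha4

for name, freqs in table_28.items():
    mean, sigma, a3, a4 = shape(freqs)
    print(name, round(mean, 3), round(sigma, 3), round(a3, 3), round(a4, 3))
```

Under these assumed definitions A and B are symmetrical ($\alpha_3 = 0$) with $\alpha_4$ below and above 3 respectively, while C has a positive $\alpha_3$, its longer tail lying to the right.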

For an unsymmetrical distribution the distance between the mean
and mode may be used to measure the degree of asymmetry or skewness,
because the mean and mode coincide in a symmetrical distribution.
Since we wish any measure of skewness to be a pure number,
we would express this distance in units of the standard deviation,
thus (mean − mode)/$\sigma$. Now it happens that there is a certain
curve known as Pearson's Type III which is used to represent certain
skew distributions, and it can be shown by higher mathematics that,
for this curve,

$$\frac{\text{mean} - \text{mode}}{\sigma} = \frac{\alpha_3}{2}. \tag{2}$$

So this relation 1 may be used as a formula for obtaining the approximate
mode.

Exercise 

Find $\alpha_3$ and $\alpha_4$ for each of the distributions A, B, and C, in Table 28.

2. Frequency Curves. As the student extends his experience he
finds several types of distributions. It is important in certain problems
to differentiate between them. Differences in type lead to the
study of frequency curves. There are several standard curves to
represent the different types of distributions that arise in practical
statistics. 2 Each of these is specified by a mathematical function
$y = f(x)$ where $f(x)$ is a general symbol for any function of x. It is,
of course, a different expression for each of the different curves.
Such functions are also called distribution functions. A complete
discussion of this subject belongs to the field of advanced statistics.
However, there are some simple concepts relating to frequency
curves which will be useful in our work.

If a frequency curve is used to represent a given distribution, the
total area under the curve corresponds to the total frequency N,
and therefore the partial area under the curve between the ordinates
erected at x = a and x = b (Figure 21) represents the number of
variates with measurement or character between a and b. The limits
between which the theoretical distribution ranges are denoted by
$l_1$ and $l_2$. It is often convenient and causes no loss of generality to
suppose that the total area under the curve is unity or 100%, in
which case the partial area between a and b represents the percentage
of variates having the given character.

1 Because of this relation some writers use $\alpha_3/2$ as a measure of skewness
instead of $\alpha_3$. Also some authors adopt a different convention as to sign, defining
skewness as negative when the mean is greater than the mode.

2 See Chapter III, Part II.

In mathematical language the "area under $f(x)$ between a and b"
is called the "integral of $f(x)$ from a to b," and is denoted by the
symbol

$$\int_a^b f(x)\,dx.$$

However, we will abbreviate this symbol and use merely $\int_a^b$ to denote
such an area.

Without attempting to be rigorous, we may say that the total area
under the curve is the limit of the area of the appropriate histogram
whose rectangles have bases $\Delta x$ and altitudes $f(x)$, as $\Delta x$ is taken
smaller and smaller and approaches zero. Thus

$$\int_{l_1}^{l_2} f(x)\,dx = \lim_{\Delta x \to 0} \sum f(x)\,\Delta x.$$

The integral sign $\int$ is a conventionalized S and denotes the sum
of elements of area with bases $dx$ and altitudes $y = f(x)$. The letters
written at the top and bottom of $\int$ denote the range over which
the sum is to be taken. Therefore the notation $\int_a^b y\,dx$ or $\int_a^b f(x)\,dx$
represents the area which is bounded by the curve $y = f(x)$, the
ordinates at x = a and x = b, and the x-axis. (Figure 21.)

The integral of $y = f(x)$ from $l_1$ to $l_2$ denotes the total frequency N.
Therefore,

$$N = \int_{l_1}^{l_2}.$$

Hence, the proportion of variates having some character x, such that
$a \le x < b$, is given by $\frac{1}{N}\int_a^b$. If N is taken as unity or 100%, then
$\int_a^b$ denotes the percentage of variates having the given character.

In more advanced work this symbol also denotes the probability
that a variate chosen at random from the universe $y = f(x)$ will
have a value between a and b.






3. The Normal Curve. Perhaps the most important of all frequency
curves is the so-called normal 1 curve whose equation is

$$y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-(x - \bar{x})^2 / 2\sigma^2}. \tag{3}$$

It is a bell-shaped curve symmetrical about x = x. It was first 
discovered by a famous French mathematician, De Moivre, about 
two hundred years ago and published in 1733. He obtained it while 
working on certain problems in games of chance which were proposed 
to him by the gamblers of his day. Because of this origin and because 
the data from certain coin- and dice-throwing experiments closely 
approach it in form, it is often called the normal probability curve. 
Actual statistical use of the normal curve began with the work of the 
famous mathematical astronomers, Laplace (1749-1827) and Gauss 
(1777-1855), each of whom derived it independently and presumably 
without knowing of De Moivre’s treatment. They found that it 
represented very well the errors of observation in the physical
sciences. 2 For this reason it has been called the normal curve of error,
where error is used in the sense of a deviation from the true value. 
Since that time experience has shown that it serves quite well to 
describe many of the distributions which arise in the fields of biology, 
education, and sociology. Much of the theory of statistics is built 
around it. 

1 The term "normal" used here should not be interpreted to mean that other
types of distribution are abnormal.

2 For a discussion of the normal curve in connection with the theory of errors
see Davis and Nelson, Elements of Statistics, p. 206. The Principia Press, Inc.

4. Standard Form. The letters $e$ and $\pi$ in equation (3) are fixed
constants, $e = 2.718$ and $\pi = 3.1416$, approximately. But the
quantities N, $\bar{x}$, and $\sigma$ are constants which vary with different distributions.
Such constants are called parameters. Here they determine
the size, position, and spread of the curve but do not have
anything to do with the fundamental characteristics of the curve.
In order to study these characteristic properties it is convenient to
represent the curve by an equation which will be independent of the
parameters; in other words, to eliminate them from the equation by a
transformation. This is accomplished by considering the total area
under the curve as unity, taking the origin at the mean, and using
the standard deviation as the unit of horizontal measurement.
In mathematical language this means that we set $N = 1$, and
$t = (x - \bar{x})/\sigma_x$. We will denote the resulting function by $\phi(t)$, that is,

$$\phi(t) = \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2} \tag{4}$$

which is called the standard form of the normal curve.

A variable, t, which is distributed in accord with (4) is said to be
normally distributed with mean zero and unit standard deviation.

Just as coordinates of points on the curve are denoted by $(x, y)$
in the case of equation (3), so in equation (4) t refers to abscissas and
$\phi(t)$ refers to ordinates. The relation between the two systems of
coordinates is given by

$$x = t\sigma + \bar{x} \tag{5}$$

for abscissas, and

$$y = \frac{N}{\sigma}\,\phi(t) \tag{6}$$

for ordinates. Equation (6) follows from (3) and (4). If the area
under the curve is taken as unity, then $y = \frac{1}{\sigma}\phi(t)$, that is, $\phi(t) = \sigma y$.

This says that since the abscissas are compressed by $\sigma$ in changing
from arbitrary units into standard units, so the ordinates must be
stretched by $\sigma$ if the area under the curve is to be the same in the two
scales of measurement.
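Relations (5) and (6) can be checked numerically. In the sketch below the parameter values N = 1000, mean 47.712, and standard deviation 5.772 are borrowed from Table 22 purely for illustration; the function names are hypothetical.

```python
import math

def phi(t):
    """Standard form of the normal curve, equation (4)."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def normal_ordinate(x, n, mean, sigma):
    """Ordinate of the normal curve in original units, equation (3)."""
    return n / (sigma * math.sqrt(2.0 * math.pi)) * math.exp(
        -((x - mean) ** 2) / (2.0 * sigma ** 2))

# Illustrative parameters only.
n, mean, sigma = 1000, 47.712, 5.772

t = 1.0
x = t * sigma + mean                      # equation (5)
y_direct = normal_ordinate(x, n, mean, sigma)
y_from_standard = n / sigma * phi(t)      # equation (6)

print(round(phi(0.0), 4), round(y_direct, 3), round(y_from_standard, 3))
```

The two ordinates agree, so a single table of $\phi(t)$ serves every normal curve once N and $\sigma$ are known.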

5. Tables of Standard Ordinates and Areas. One of the reasons
for writing the equation in standard form is that the ordinates and
areas may be tabulated once and for all. These tables are given in
the Appendix. We see from (4) that $\phi(-t) = \phi(+t)$, i.e., the ordinates
for negative values of t are the same as for the corresponding
positive values of t, and the curve is symmetrical about the ordinate
at t = 0. Therefore it is necessary to tabulate values of $\phi(t)$ for
positive t's only. Equation (4) may be graphed by plotting the
points corresponding to a few well chosen values from the tables
and drawing a smooth curve through them. (Figure 22.)

Fig. 22

The curve approaches very close to the horizontal axis at each
extremity but is asymptotic, that is, it does not quite touch the axis
no matter how far extended. We say its limits are at $-\infty$ and
$+\infty$. Although the infinite abscissal range is never met in practice
it may be characteristic of the "universe" from which a given
distribution is a sample. Therefore, this infinite feature is useful in
theoretical investigations. Moreover, even in representing observed
distributions the infinite range causes no practical
difficulty because the curve comes down to the horizontal axis very rapidly
beyond $t = \pm 3$. The combined area at the two extremities
beyond $t = \pm 3$ is only .27 of 1% of the total area
under the curve.

Partial areas between ordinates erected at various values of $t$, say
between $t = a$ and $t = b$, are denoted by $\int_a^b$. Thus the area from
$t = 0$ to $t = 1$ is given by $\int_0^1 = .3413$. Since the total area under
$\phi(t)$ is taken as unity the area on either side of $t = 0$ is 0.5 and it
is only necessary to tabulate the areas $\int_0^t$ for positive values of $t$.
Thus the area from $t = -1$ to $t = 0$ is equal to the area from $t = 0$ to
$t = 1$. In symbols this would be stated as follows:

$\int_{-1}^{0} = \int_0^{1}$.

Any other areas required may be found by an appropriate addition
or subtraction of tabular values.

For example, suppose the area below $t = -2$ is required. This is
denoted $\int_{-\infty}^{-2}$. Now the area from $-\infty$ to $-2$ equals 0.5 minus the
area from $-2$ to 0. And the area from $-2$ to 0 is the same as from 0 to 2.
That is,

$\int_{-\infty}^{-2} = .5 - \int_{-2}^{0} = .5 - \int_0^{2}$.

But $\int_0^2 = .4772$.

$\therefore \int_{-\infty}^{-2} = .5 - .4772 = .0228$.







Both areas and ordinates for decimal values of t between tenths may 
be approximated by interpolating between the values given in the 
tables. 

The illustrative examples following §6 will help the student 
become familiar with the tables. He should verify the answers and 
draw a simple sketch of the curve showing the ordinates or areas in 
each case. 


The symbol $\int_{-\infty}^{t}$ denotes a cumulative relative frequency, i.e.,
the percentage of the total frequency $N$ which is less than $t$. In order
to find values of $\int_{-\infty}^{t}$ from the tables, for assigned values of $t$, the
student should observe (from a figure) that

$\int_{-\infty}^{t} = .5 \pm \int_0^{|t|}$,

the plus or minus sign to be used according as $t$ is positive or negative.
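
The table look-ups of this section can be imitated numerically. The sketch below is not the method of the Appendix tables; it assumes Python's standard `math.erf` as a stand-in for the tabulated half-areas, and the function names are illustrative only.

```python
import math

def half_area(t):
    """Area under phi from 0 to t for t >= 0 (the tabulated quantity)."""
    return 0.5 * math.erf(t / math.sqrt(2.0))

def area_below(t):
    """Area from -infinity to t: .5 plus or minus the half-area of |t|."""
    if t >= 0:
        return 0.5 + half_area(t)
    return 0.5 - half_area(-t)

def area_between(a, b):
    """Area between the ordinates at t = a and t = b, by subtraction."""
    return area_below(b) - area_below(a)

print(round(half_area(1.0), 4))    # area from t = 0 to t = 1
print(round(area_below(-2.0), 4))  # area below t = -2
```

Only `half_area` plays the part of the table; every other area is obtained from it by the additions and subtractions described in the text.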

6. Properties. A knowledge of the properties of the normal curve 
is essential for an intelligent use of the curve in practical statistics. 
A demonstration of some of these properties is beyond the scope of 
the present discussion although quite simple in the calculus. The 
following properties are the most important and interesting. 

1. The mean, median, and mode coincide at $t = 0$. The height
of the maximum ordinate in standard form is $1/\sqrt{2\pi}$ because when
$t = 0$, $\phi(0) = 1/\sqrt{2\pi} = .3989$.

2. Since the standard deviation is the unit of measurement along
the horizontal axis, $\sigma_x = 1$ in the $t$ scale. Any $t$ value may be
converted into the corresponding $x$ value by (5). In the vertical direction
$N/\sigma$ is the unit of measurement and any $\phi(t)$ ordinate may be
converted into $y$ units by means of (6).

The area under (3) in the range from $x = c$ to $x = d$ is denoted by

$\int_c^d y\,dx$.

If $t = a$ and $t = b$ denote the corresponding range in standard units,
then

$\int_a^b \phi(t)\,dt$

denotes the corresponding area, in standard units, under (4). It is
shown in the calculus that $dx = \sigma_x\,dt$. Therefore from (6) we have

(7) $\int_c^d y\,dx = N \int_a^b \phi(t)\,dt$.

If the interval goes from $x = c$ to $x = d$, (7) says that the

Frequency over $(c, d) = N \int_a^b$

where

$a = (c - \bar{x})/\sigma_x$, $b = (d - \bar{x})/\sigma_x$.

This merely means that the percentages (relative frequencies) obtained
from the tables may be converted into numbers (frequencies)
by multiplying the percentages by $N$.

3. The curve changes from concave to convex at $t = \pm 1$. In the
$x$-scale, referred to the origin of $x$, these points are at $x = \bar{x} \pm \sigma_x$.
They are called points of inflection and their position is important
in making an accurate drawing of the curve.

4. The standard deviation is approximately 25% greater than
the mean deviation. More precisely, MD $= .798\sigma$.

5. The quartiles, $Q_1$ and $Q_3$, are equidistant from $t = 0$ and
therefore from the mean. By definition

$Q_3$ is that value of $t$ for which $\int_{-\infty}^{t} = .75$,

i.e., for which $\int_0^{t} = .25$. From the tables this is $t = .6745$.
Therefore in arbitrary units,

$Q_3 = \bar{x} + .6745\sigma_x$ and $Q_1 = \bar{x} - .6745\sigma_x$.

6. The quartile deviation (semi-interquartile range) for a normal
distribution will be denoted by $E$. Its value is

$E = \frac{Q_3 - Q_1}{2} = \frac{(\bar{x} + .6745\sigma) - (\bar{x} - .6745\sigma)}{2} = .6745\sigma$.

In standard units this is $s = E/\sigma = .6745$.

7. The quantity $E$ (or $s$) has a significance in probability theory.
If a variable $x$ is distributed according to the normal curve, the
probability is one half that a variate selected at random will have a
value between $\bar{x} - E$ and $\bar{x} + E$. The reason for this statement is
that 50% of the variates have values within this range. $E$ is
commonly, though somewhat ambiguously, called “probable error.”

8. $E$ is in units of $x$ whereas $s$ is a value of $t$, that is, $s$ is the value
$t = .6745$, and $E$ is the value $x = .6745\sigma_x$. Just as $\sigma_x$ may be used
as a yardstick in scaling off a distribution on either side of the mean
(§6, Chapter V), so may $E$ or $s$ be used in a similar manner. When
thinking of them in this way it is useful to regard $E$ as a yardstick
about two-thirds the length of $\sigma_x$. The following table gives the
end-points of certain intervals in $t$, $x'$, and $x$ units, respectively, where
$t = x'/\sigma_x$ and $x' = x - \bar{x}$.


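End Points of Certain Intervals in $t$, $x'$, $x$

        When $\sigma$ is the unit                          When $E$ is the unit
  $t$        $x'$           $x$                    $t$            $x'$                  $x$
  0          0              $\bar{x}$              0              0                     $\bar{x}$
  ±1         ±$\sigma$      $\bar{x} \pm \sigma$   ±.6745         ±.6745$\sigma$        $\bar{x} \pm .6745\sigma$
  ±2         ±2$\sigma$     $\bar{x} \pm 2\sigma$  ±1.349         ±1.349$\sigma$        $\bar{x} \pm 1.349\sigma$
  ±3         ±3$\sigma$     $\bar{x} \pm 3\sigma$  ±2.023         ±2.023$\sigma$        $\bar{x} \pm 2.023\sigma$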


The percentage distribution of area under the normal curve is
given (approximately) in Figure 23 where $\sigma_x$ is the unit of
measurement along the horizontal axes and in Figure 24 where $s$ is the unit.

[Fig. 23 and Fig. 24: percentage distribution of area under the normal curve, with $\sigma_x$ and $s$ respectively as the unit]

The percentages given in the figures may be regarded as abridged
tables. Of course the tables in the Appendix will ordinarily be used
in problems.

With reference to Figure 23, it is sometimes said that if values of $x$
are normally distributed, the probability that a value chosen at
random will fall within the range $x_1 < x < x_2$, where $x_1 = \bar{x} - \sigma_x$
and $x_2 = \bar{x} + \sigma_x$, is .68.

9. There are other forms of the equation of the normal curve.
If the origin is taken so that the $x$ coordinate of the centroid of area
under the curve is zero, then (3) becomes

$y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-x^2/(2\sigma^2)}$.

Physicists and astronomers sometimes prefer the form

$y = \frac{Nh}{\sqrt{\pi}}\, e^{-h^2 x^2}$.

The shape of the curve, whether relatively steep or relatively flat
(in the sense of dispersion not kurtosis) depends upon the constant $h$.
It can be proved by more advanced mathematics that if a set of $N$
variates is distributed in accord with the normal curve, then
$h = 1/(\sqrt{2}\,\sigma_x)$, where $\sigma_x$ is computed from the given $N$ variates.
Obviously the smaller the value of $\sigma_x$ the larger is $h$. For this reason $h$ is
called the “index of precision.” For a given value of $N$, the curve
is relatively steep when $\sigma_x$ is relatively small, i.e., when $h$ is relatively
large.

10. The curve is symmetrical and $\alpha_3 = 0$. The fourth moment
about the mean is equal to three times the square of the second
moment about the mean, i.e., $\mu_4 = 3\mu_2^2$ and therefore $\alpha_4 = \mu_4/\mu_2^2 = 3$.
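
Several of the properties listed above can be checked numerically without calculus. The sketch below is an illustration under stated assumptions, not the author's procedure: `math.erf` stands in for the table of areas, the quartile is located by bisection, the moments are approximated by the trapezoidal rule on the range -10 to 10, and the values N = 1000, mean 50, standard deviation 10 are hypothetical.

```python
import math

def phi(t):
    """Standard normal ordinate, equation (4)."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def area_below(t):
    """Area under phi from -infinity to t."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def frequency_between(c, d, n, mean, sigma):
    """Property 2, equation (7): N times the area between standardized limits."""
    a = (c - mean) / sigma
    b = (d - mean) / sigma
    return n * (area_below(b) - area_below(a))

def standard_quantile(p, lo=-10.0, hi=10.0, tol=1e-10):
    """Solve area_below(t) = p by bisection; area_below increases with t."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if area_below(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def integrate(f, a=-10.0, b=10.0, steps=20000):
    """Composite trapezoidal rule; the area beyond |t| = 10 is negligible."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# Properties 5 and 6: the third quartile in standard units.
s = standard_quantile(0.75)

# Property 4: mean deviation of the standard curve (sigma = 1).
mean_dev = integrate(lambda t: abs(t) * phi(t))

# Property 10: moments about the mean and the ratio alpha_4.
mu2 = integrate(lambda t: t ** 2 * phi(t))
mu3 = integrate(lambda t: t ** 3 * phi(t))
mu4 = integrate(lambda t: t ** 4 * phi(t))
alpha4 = mu4 / mu2 ** 2

# Property 2 with hypothetical values: frequency within one sigma of the mean.
within_one_sigma = frequency_between(40.0, 60.0, 1000, 50.0, 10.0)

print(round(s, 4), round(mean_dev, 3), round(alpha4, 3), round(within_one_sigma, 1))
```

The printed values should agree with the constants quoted in properties 4, 6 and 10 and with the figure of about 68% within one standard deviation of the mean.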


Examples 

1. Find the ordinates of $\phi(t)$ for (a) $t = 2.3$, (b) $t = -2.3$, (c) $t = .67$.
Solutions from the Tables in the Appendix:

(a) $\phi(2.3) = .02833$

(b) $\phi(-2.3) = .02833$

(c) $\phi(.67) = .31874$


2. Find the following areas under $\phi(t)$ and use the integral notation:

(a) From $t = 0$ to $t = 3.00$

(b) From $t = 1.5$ to $t = 2.5$

(c) From $t = -2$ to $t = 1.3$

(d) From $t = 0$ to $t = .6745$


Solutions from the Tables:

(a) The required area is given by $\int_0^{3}$, which we find to be .49865.


(b) The area from $t = 0$ to $t = 1.5$ is $\int_0^{1.5} = .43319$, and from $t = 0$ to
$t = 2.5$ is $\int_0^{2.5} = .49379$. Therefore the required area is

$\int_{1.5}^{2.5} = \int_0^{2.5} - \int_0^{1.5} = .0606$.

(c) Since the area from $t = 0$ to $t = -2$ is the same as from $t = 0$ to
$t = +2$ we have

$\int_{-2}^{1.3} = \int_0^{2} + \int_0^{1.3} = .47725 + .40320 = .88045$.

(d) Here we must interpolate:

For $t = .67$, $\int_0^{t} = .24857$
For $t = .6745$, $\int_0^{t} = A$ (say)
For $t = .68$, $\int_0^{t} = .25175$.

Therefore

$\frac{A - .24857}{.25175 - .24857} = \frac{.0045}{.01}$

whence $A = .25$.


3. Show that for equation (3), the percentages of area outside the given ranges
are as stated below:

Above $\bar{x} + \sigma$ = 15.87%

Outside $\bar{x} \pm \sigma$ = 31.74%

Outside $\bar{x} \pm 2\sigma$ = 4.56%

Outside $\bar{x} \pm 3\sigma$ = 0.27%

Solution: Converting these ranges into $t$ units, and remembering that only
the positive half of the area under $\phi(t)$ is tabulated and equals .5, we have

Area above $t = 1$ is $.5 - \int_0^1 = .1587 = 15.87\%$

Area outside $t = \pm 1$ is 2(15.87%) = 31.74%

Area outside $t = \pm 2$ is $2(.5 - \int_0^2) = .0456 = 4.56\%$

Area outside $t = \pm 3$ is $2(.5 - \int_0^3) = .0027 = .27\%$.



4. Given $N = 1500$, $\bar{x} = 75$, $\sigma_x = 10$. If the variates are distributed according
to the normal curve, (a) find the value of $x$ for which cum f = 800, (b) for
which cum f = 450, (c) how many of the $N$ variates lie where $x < 80$?
Solutions:

(a) By definition, cum f $= \int_{-\infty}^{x} y\,dx$, and from (7),
$\int_{-\infty}^{x} y\,dx = N \int_{-\infty}^{t}$, i.e.,

$800 = 1500 \int_{-\infty}^{t}$, so that $\int_{-\infty}^{t} = .5333$.

But $\int_{-\infty}^{t} = .5 + \int_0^{t}$, $\therefore \int_0^{t} = .0333$,

whence from the tables, $t = .083$. Substituting in equation (5), $x = 75.83$.

(b) We have $\int_{-\infty}^{t} = 45/150 = .3$ and $t$ is negative. Since
$\int_{-\infty}^{t} = .5 - \int_0^{|t|}$, we have $\int_0^{|t|} = .2$, whence we find that

$t = -.524$

so $x = 69.76$.

(c) From the relation $t = (x - \bar{x})/\sigma_x$ we find that
$t = .5$ when $x = 80$.

From the tables, $\int_{-\infty}^{.5} = .69146$.

From (7) we have

$\int_{-\infty}^{80} y\,dx = 1500(.69146) = 1037.2$.
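
The interpolation of Example 2(d) and the three parts of Example 4 can be retraced with a short script. This is a sketch under assumptions: `math.erf` replaces the table of areas, the inverse look-up is done by bisection rather than by reading the table backwards, and all function names are illustrative. Because the text rounds $t$ to three decimals, the computed $x$ values may differ from the printed ones in the last digit.

```python
import math

def area_below(t):
    """Area under the standard curve from -infinity to t."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def standard_quantile(p, lo=-10.0, hi=10.0, tol=1e-10):
    """Find t with area_below(t) = p by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if area_below(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def interpolate(t, t0, a0, t1, a1):
    """Linear interpolation between two adjacent table entries."""
    return a0 + (a1 - a0) * (t - t0) / (t1 - t0)

# Example 2(d): table entries quoted in the text for t = .67 and t = .68.
area_d = interpolate(0.6745, 0.67, 0.24857, 0.68, 0.25175)

# Example 4: N = 1500, mean 75, standard deviation 10.
n, mean, sigma = 1500, 75.0, 10.0
x_a = mean + sigma * standard_quantile(800 / n)    # cum f = 800
x_b = mean + sigma * standard_quantile(450 / n)    # cum f = 450
count_c = n * area_below((80.0 - mean) / sigma)    # variates with x < 80

print(round(area_d, 4), round(x_a, 2), round(x_b, 2), round(count_c, 1))
```

Parts (a) and (b) use equation (5) after the inverse look-up, and part (c) uses equation (7) directly.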



Exercises 


1. Find $\phi(2.65)$, $\phi(-1.46)$, $\phi(0)$.

2. Find $t$ if $\phi(t) = .1257$, .0325, .0034, respectively.

3. Find the following areas under $\phi(t)$ and draw a figure in each case:


(o) 

( 6 ) 


r, r, r, r. r 

Jo J- 1.2 J- oo J 1.2 %)- 1.2 

/».37 /\6745 

t/ —.37 * J-.6745 


4. Find given the partial areas: 



.27457, 


£ 


.99730. 


5. Verify the percentages given in Figures 23 and 24. 

6. (a) How far from the median of a normal distribution is the first quartile?

(b) In a certain normal distribution $\bar{x} = 89$ and $Q_1 = 75.51$. What is $\sigma_x$?

7. For a normal distribution: $N = 1000$, $\bar{x} = 20$, $\sigma_x = 2$.

(a) What is $E$?

(b) Find the value of $Q_3$.

(c) What values of $x$ will include the middle 500?

(d) The middle 75%?

8. If $N = 300$, $\bar{x} = 75$, $\sigma_x = 15$, for a normal distribution:

(a) What is the value of the first quartile?

(b) The third quartile?

(c) How many variates are between $x = 60$ and $x = 90$?

9. In a college the 8 grades A, A-; B, B-; C, C-; D, and F are given.
On the assumption that mathematical ability is normally distributed,
how many out of a total of 1000 should receive each grade? Assume
that $\bar{x}$ is the boundary between the C and B- grades and that each grade
interval is $.8\sigma$. What range in standard units on either side of $\bar{x}$ is thereby
assumed to include all the grades?

10. What are the percentages of a normal distribution outside $\bar{x} \pm t\sigma_x$ for
$t = 1, 2, 3$?

7. Curve Fitting. It should be remembered that data collected 
and presented in the form of a frequency distribution is merely a 
sample of a general type called its universe. Other samples from 
that universe might yield somewhat different frequency distributions. 

For certain purposes it may be desirable to fit a normal curve to
a unimodal distribution which is reasonably symmetrical and appears
to be of the normal type. The theoretical curve idealizes the recalcitrant
observational data and smooths out the irregularities due to
sampling fluctuations. Thus, we may compare the theoretical
frequencies with the observed.





Another reason for fitting a curve to observed data is that the 
equation of the curve provides an efficient device for preserving 
those data. If the curve gives a satisfactory representation, then 
the observed data may be discarded and reproduced, at will, from 
the equation. 

In fitting equation (3) to a given distribution, we assume that

(1) The given frequency $N$ represented by a histogram equals the area
under the curve, and

(2) The mean and standard deviation of the observed distribution
equal, respectively, the mean and standard deviation of the theoretical
distribution represented by the curve.

The term “parameter” is often used to denote some characteristic of
a theoretical distribution. For example, $\bar{x}$ and $\sigma$ in (3) are the
parameters in a normal distribution. An appropriate estimate of a
parameter by the use of a function of observed data is called a
statistic. Assumption (2) means, then, that we replace each of the
parameters by the corresponding statistic.

Thus, for the equation of the normal curve which fits as closely
as possible the distribution of the weights of Glasgow schoolgirls
(Table 21) we substitute

$\bar{x} = 47.712$
$\sigma_x = 5.772$
$N = 1000$

in equation (3), and obtain

$y = \frac{1000}{5.772\sqrt{2\pi}}\, e^{-\frac{(x - 47.712)^2}{2(5.772)^2}}$.

To make use of a table of standard ordinates in graphing this
equation we transform it into standard units by setting

(a) $t = \frac{x - 47.712}{5.772} = .17325x - 8.2661$

and write

(b) $y = \frac{N}{\sigma}\,\phi(t) = 173.25\,\phi(t)$.

Appropriate values to assign $x$ in equation (a) are the end-x and
mid-x values of the given distribution. The use of a computing
machine in changing $x$ values into corresponding $t$ values is explained
in §6, Chapter IV. Thus we obtain the values in the second
column of Table 29. We may then enter the tables in the Appendix
for the corresponding ordinates, $\phi(t)$. These are converted into $y$
values by equation (b). The curve may then be drawn by plotting
the $x$ and $y$ values. (Figure 25.) The curve should be drawn so
as to be symmetrical with respect to the ordinate at the mean and
its points of inflection should be at a distance from the mean equal
to $\sigma$. The student should observe that every pair of $(x, y)$ values
computed in Table 29 furnishes two points for the graph, each
symmetrical to the other with respect to the mean ordinate. Both
points should be used in drawing the curve but only the computed
point should be left permanently on the graph.

After the curve is drawn, the histogram for the observed data may
be constructed. The column headed f/c gives the heights of the
rectangles on the same scale as the ordinates of the curve.

[Marginal figure: one rectangle of the histogram, with f = area, c = base, f/c = height]

Table 29: $t = .17325x - 8.2661$, $y = 173.25\,\phi(t)$

   x         t       $\phi(t)$      y       f/c
  27.5    -3.502     .00086       .15
  29.5    -3.155     .00275       .48      0.25
  31.5    -2.809     .00772      1.34
  33.5    -2.462     .01927      3.34      3.50
  35.5    -2.116     .04253      7.37
  37.5    -1.770     .08329     14.43     14.00
  39.5    -1.423     .14494     25.11
  41.5    -1.076     .22361     38.74     43.00
  43.5    -0.730     .30563     52.95
  45.5    -0.383     .37072     64.23     61.25
  47.5    -0.037     .39866     69.07
  49.5     0.310     .38023     65.87     65.75
  51.5     0.656     .32230     55.84
  53.5     1.003     .24124     41.79     39.00
  55.5     1.350     .16038     27.79
  57.5     1.696     .09893     17.14     16.75
  59.5     2.042     .04960      8.59
  61.5     2.389     .02299      3.98      5.75
  63.5     2.735     .00948      1.64
  65.5     3.082     .00346      0.60      0.75
  67.5     3.428     .00111      0.19

Fig. 25. Normal Curve Fitted to Histogram Representing Weight
Distribution of Glasgow Schoolgirls (Table 21)

The smooth curve is plotted from the points (x, y) given in Table 29. The
column headed f/c in that table gives the heights of the rectangles in the
histogram, c = 4. When both the curve and the histogram are to be drawn, it is best
to draw the curve first so that the presence of the histogram will not prejudice
one into trying to make the curve fit the histogram.
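
The columns $x$, $t$, $\phi(t)$ and $y$ of Table 29 can be regenerated from equations (a) and (b). The sketch below assumes the three statistics quoted in the text and computes $\phi(t)$ directly from equation (4) instead of reading it from the Appendix; the function names are illustrative. Since the book rounds $t$ to three decimals before entering the tables, an occasional last digit may differ.

```python
import math

MEAN, SIGMA, N = 47.712, 5.772, 1000  # statistics quoted in the text

def phi(t):
    """Standard normal ordinate, equation (4)."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def table_29_rows(start=27.5, stop=67.5, step=2.0):
    """Return (x, t, phi(t), y) for the end-x and mid-x values."""
    rows = []
    count = int(round((stop - start) / step)) + 1
    for i in range(count):
        x = start + i * step
        t = (x - MEAN) / SIGMA              # equation (a)
        p = phi(t)
        rows.append((x, t, p, (N / SIGMA) * p))  # equation (b)
    return rows

for x, t, p, y in table_29_rows():
    print(f"{x:5.1f} {t:7.3f} {p:8.5f} {y:6.2f}")
```

The heights f/c of the histogram are not computed here; they come from the observed frequencies divided by the class interval c = 4.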

8. Graduation. The areas under the fitted curve and over the
class intervals are called theoretical frequencies. Thus in Figure 25
the shaded area represents the theoretical frequency corresponding
to the observed frequency which is represented by the rectangle the
mid-point of whose base is 41.5 pounds. The determination of the
theoretical frequencies is called “graduation by the normal curve.”
It is a process of smoothing out the data to fit the curve. The method
is shown in Table 30 for the data represented by Figure 25.

In order to enter a table of standard areas we must change the
end-x values into $t$ values. These are given in the third column of
Table 30. They are part of the values already computed for Table 29.

The entries in the column headed $A = \int_{-\infty}^{t}$ are the (cum f)/N
values of the standard curve for the given end-points. The entries in
the column headed $\Delta A$ are obtained by differencing the preceding
column. (See last paragraph of §9, Chapter I.) They are the
percentages f/N to be expected in the various intervals on the hypothesis
of a normal distribution. Therefore $N\Delta A$ gives the numbers to be
expected, that is, the theoretical frequencies.

The student should study this table until he becomes familiar with
all the operations involved and what they mean. He should
distinguish between the purposes of Tables 29 and 30.
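
The graduation itself (the columns $A$, $\Delta A$ and $N\Delta A$) amounts to differencing cumulative areas at the class boundaries. The sketch below assumes the same statistics as the fitted curve and uses `math.erf` instead of the table of areas; the boundary list is read from the end-x values 31.5 to 63.5, and the names are illustrative.

```python
import math

MEAN, SIGMA, N = 47.712, 5.772, 1000  # statistics of the fitted curve

def area_below(t):
    """Cumulative relative frequency A at the standard abscissa t."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def graduate(boundaries, mean, sigma, n):
    """Theoretical frequencies for the classes cut off by the boundaries.

    The first class runs from -infinity and the last to +infinity, so the
    frequencies add up to n.
    """
    cumulative = [0.0]
    cumulative += [area_below((x - mean) / sigma) for x in boundaries]
    cumulative += [1.0]
    return [n * (cumulative[i + 1] - cumulative[i])
            for i in range(len(cumulative) - 1)]

boundaries = [31.5, 35.5, 39.5, 43.5, 47.5, 51.5, 55.5, 59.5, 63.5]
theoretical = graduate(boundaries, MEAN, SIGMA, N)

for lower, freq in zip(["-inf"] + boundaries, theoretical):
    print(lower, round(freq, 1))
```

Treating the two end classes as open-ended is what makes the total of the theoretical frequencies equal to $N$.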




Table 30

  Observed     Boundary                    $A = \int_{-\infty}^{t}$      $\Delta A$     $N\Delta A$ = Theoretical
  Frequency       x            t                                                        Frequency

               $-\infty$    $-\infty$          .0000
      1                                                                   .0025            2.5
                 31.5        -2.809            .0025
     14                                                                   .0147           14.7
                 35.5        -2.116            .0172
     56                                                                   .0602           60.2
                 39.5        -1.423            .0774
    172                                                                   .1553          155.3
                 43.5        -0.730            .2327
    245                                                                   .2527          252.7
                 47.5        -0.037            .4854
    263                                                                   .2587          258.7
                 51.5         0.656            .7441
    156                                                                   .1674          167.4
                 55.5         1.350            .9115
     67                                                                   .0679           67.9
                 59.5         2.042            .9794
     23                                                                   .0175           17.5
                 63.5         2.735            .9969
      3                                                                   .0031            3.1
               $\infty$     $\infty$          1.0000

  Totals                                                                 1.0000         1000.0


9. Purpose of a Graduation. If, for the distribution of graduated
frequencies, the mean, standard deviation, and total frequency be
found, their values will be precisely those of the corresponding
moments in the observed frequency distribution. This must be so,
because these were the conditions postulated in the process of
graduation. Moreover, the observed values of skewness and kurtosis as
given by $\alpha_3$ and $\alpha_4$ will not differ appreciably from the theoretical
values if the fitting of the normal curve to the observed distribution
was justified.

Since the above parameters characterize a distribution, the
observing student may wonder why a distribution should be graduated
if the values of these parameters are unaltered in the process.

There are three main reasons why a student should be taught to graduate a 
curve. The first, and least important, has to do with the use of a smooth curve 
in place of a jagged sample. The second, and most important, is that it is 
necessary for the mathematical development of statistics that the mathematician 
should be told what assumptions he may make. These usually depend on the 
types of frequency curves which can be depended on to fit phenomena. . . . 
A third reason, intermediate in importance between the other two, is that in 
testing a priori theories in various fields, it is often necessary to test the efficacy 
of the frequency distributions which are results of these theories. 1 

The second and third of the above reasons may seem somewhat 
abstruse, but it is not easy to give completely satisfactory explanations 
of them at this stage of the student’s development. About all we can 
say at this time is that the distribution of variation of a variable x 
about its mean value is a fundamental statistical concept and in 
certain theoretical investigations it is very important that we have 
mathematical functions which are capable of representing such 
distributions. This is particularly true in sampling theory which 
will be discussed in Part II. 

The first reason is more readily understood. Occasionally in
practical problems it may be desirable to use the theoretical
frequencies obtained by graduation in place of the observed data which
probably contain irregularities due in part to grouping, in part to
sampling fluctuations. We cite here two illustrations.

Example 1. A company which operates a chain of men’s haberdashery stores
planned to bring out a new line of about 100,000 light weight sport shirts suitable
for camping, hunting, etc. The question arose as to the determination of the
number of each size that should be ordered from the factory. Their previous
distribution of sizes had not been satisfactory because the demand for certain
sizes had been different from the number manufactured. Therefore the statistical
department was requested to recommend the distribution of the proposed order
according to neck sizes. The solution of the problem hinged upon the
availability of data giving the measurements of neck circumferences of a large sample
of men. Satisfactory data were found in the “Reports of the Medical
Department of the United States Army in the World War,” which gave a table of the
neck measurements in centimeters of 95,102 white troops at demobilization.
Since these data are tabulated in class intervals which are slightly different from
the ranges used in standard shirt-band sizes, a slight adjustment was necessary.
But essentially a normal curve was fitted to this distribution and the graduated
frequencies were taken as the number of potential customers for each shirt size.
The result was quite satisfactory.

1 Journal of the American Statistical Assoc., vol. XXVI, March 1931,
Supplement, p. 36.

Example 2. A well known and interesting illustration of the desirability of
smoothing occurs in the census returns. The census takers’ records show more
persons alive at age 30 than at age 29, more at age 35 than at age 34, more at 40
than at 39, etc. This is probably due to the fact that men (as well as women)
do not tell their exact ages. A person who is actually 41 or 42 and known to be
40 or so, says he is 40. The recorded data show artificial bumps at every age
which is a multiple of 5. Naturally the Census Bureau prefers the smoothed
results to the observed. The student should not infer that the curve used to
smooth these data is the normal type. The “life curve” is a continuously
decreasing function. However, the same kind of quinquennial irregularity occurs
in other actuarial data which do approximate the form of a normal curve. Many
examples are given in Elderton, Frequency Curves and Correlation.

10. Probability Paper. The cumulative frequencies for the normal
curve $\phi(t)$ are given by $A = \int_{-\infty}^{t}$. As $t$ varies from $-\infty$ to $+\infty$,
$A$ varies from 0 to 1, and for the finite range $t = \pm 3$ (commonly met
in practice) $A$ varies from .00135 to .99865. (Verify.) Regarding $A$
as a function of $t$, values of $(t, A)$ from the tables may be plotted and
the resulting points joined by a smooth curve.

When graphed on an algebraic scale this curve is the ogive of the
normal curve. It is also called the integral curve of $\phi(t)$. As
indicated in Figure 26, the ordinate of the ogive is zero at $t = -\infty$;
.5 at $t = 0$, and the ogive approaches the line $A = 1$ asymptotically.

Now imagine the vertical scale of Figure 26 stretched in such a
way that the ogive becomes a straight line. The stretching required
will be least around the line $A = .5$, where the ogive is steepest, and
will gradually increase as the distance from this line increases.

Paper so ruled that the $(t, A)$ graph is a straight line is called
probability paper. It is readily obtainable and is convenient for many
purposes. Thus, by plotting cum f for an observed distribution on
probability paper, one may observe how closely it approximates a
straight line and hence get an idea of how nearly normal it is. One
may thus locate graphically the median, quartiles, etc., and estimate
frequencies between given limits.
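
The ruling of probability paper can be imitated by replacing each cumulative relative frequency $A$ with the standard abscissa $t$ that has that area below it. The sketch below is an illustration only: it assumes `math.erf` for the areas, finds $t$ by bisection, and uses the boundaries and $A$ values of Table 30 (the boundary 43.5 is taken from Table 29, where it corresponds to $t = -0.730$). On the stretched scale consecutive points should have nearly equal slopes.

```python
import math

def area_below(t):
    """Area under the standard curve from -infinity to t."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def standard_quantile(p, lo=-10.0, hi=10.0, tol=1e-10):
    """Find t with area_below(t) = p by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if area_below(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Class boundaries and cumulative relative frequencies A from Table 30.
boundaries = [31.5, 35.5, 39.5, 43.5, 47.5, 51.5, 55.5, 59.5, 63.5]
cumulative = [0.0025, 0.0172, 0.0774, 0.2327, 0.4854,
              0.7441, 0.9115, 0.9794, 0.9969]

# Height of each point on the stretched (probability) scale.
stretched = [standard_quantile(a) for a in cumulative]

# Slopes between consecutive points; a straight line gives equal slopes.
slopes = [(stretched[i + 1] - stretched[i]) / (boundaries[i + 1] - boundaries[i])
          for i in range(len(boundaries) - 1)]

print([round(m, 3) for m in slopes])
```

For these graduated values the common slope is close to $1/\sigma_x = .17325$; for observed cumulative frequencies the departure of the slopes from equality gives a rough idea of the departure from normality.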



Fig. 26— Ogive of the Normal Curve 


A more complete discussion giving references to writers who
suggested and developed the use of probability paper may be found in
the Journal of the American Statistical Association, vol. XXVI, June
1931, p. 178.


Exercises 

1. Construct three normal curves on the same axes according to the following
specifications. Compute ordinates at intervals of … from the mean in
the range $\bar{x} \pm 3\sigma$.

  Curve    $\sigma_x$    $\bar{x}$     $N$
    A         10           50          400
    B         10           50          800
    C         10           50         1200


Suggested form for computations: 




2. Construct three normal curves on the same axes according to the following
specifications. Compute ordinates at intervals of $.5\sigma$ from the mean.

  Curve    $\sigma_x$    $\bar{x}$     $N$
    A         15           50         1000
    B         10           50         1000
    C          5           50         1000

Suggestion: Observe that:

$y_C = 200\,\phi(t)$

$y_B = \frac{1}{2} y_C = 100\,\phi(t)$

$y_A = \frac{1}{3} y_C = \frac{200}{3}\,\phi(t)$.

3. Verify the entries in Tables 29 and 30. 

4. For the following distribution:

(a) Find the equation of the best fitting normal curve, and plot the curve
and histogram.

(b) Find the graduated frequencies.

  mid-x    2    4    6    8    10
  f        1    4    6    4     1

5. Graduate the distribution in Table 8, §10, Chapter I. Also find the
ordinates of the best fitting normal curve and plot the curve and histogram.

6. A distribution of the weekly wages of 906 anthracite miners showed the
following results:

$\bar{x} = \$36.13$     $\alpha_3 = .007$

$\sigma_x = \$8.87$     $\alpha_4 = 3.02$

Assuming a normal distribution, estimate the number of the 906 miners
who received weekly wages (a) in excess of \$45, (b) less than \$25.

7. An urban electric railway company operating a large city subway uses
thousands of electric light bulbs in its underground stations. On January
1, 1934, the company put into service 5000 new light bulbs. Let it be
assumed that these 5000 bulbs will have a mean life of 50 days, a
standard deviation of 19 days, and that their lives conform to the normal
curve.

If January 1st be counted as a full day in the life of the bulbs: (a) How
many bulbs would have had to be replaced by midnight January 31, 1934?
(b) How many by March 10, 1934?

8. Which properties of the normal curve may be used as criteria in passing
judgment on the normality of an observed distribution? Would you say
that the distributions referred to in Table 23 are approximately normal?

9. Graph the ogive of the normal curve by plotting values of $(t, A)$ in the range
$t = \pm 3$, (a) on an algebraic scale, (b) on probability paper.

10. What famous mathematicians’ names are associated with the normal curve?
When did these men live? Which of them should most appropriately be
credited with the discovery of this curve?

11. (Camp.) The standard deviation of a certain set of 100,000 high school
grades was 11%, and the mean grade was 78%. Assume the distribution
to have been normal, and, being careful not to confuse percentage in the
sense of grade with a percentage of frequency, answer the following
questions: How many grades were (a) above 90%, (b) below 70%? (c) What
was the highest grade of the lowest 1000? (d) Within what limits did the
middle 90,000 lie? (e) What was the semi-interquartile range?

12. (Camp.) Answer all the questions of Exercise 11 with reference to a set of
100,000 grades in which the median was 83% and $Q_3$ was 90%. Also
find $\sigma_x$.

13. In a certain normal distribution, $N = 1000$, $\bar{x} = 50$, $\sigma_x = 10$. For this
distribution:

(a) Convert the following x's into the corresponding t's,

  x    15   20   25        35   40   45        55   60   65        75   80   85
  t

(b) Find from the tables the values of $\phi(t)$ for the $t$ values in (a).

(c) Convert the $\phi(t)$ values obtained in (b) into $y$ values.

(d) Plot the (x, y) values in (a) and (c) and draw a smooth curve through
them.

(e) Find the cumulative relative frequencies, $A = \int_{-\infty}^{t}$, for the values of $t$
in (a).

(f) Difference your results in (e) by finding $\Delta A$.

(g) Convert the percentages in (f) into frequencies.

(h) Explain the meaning of your results in (g) with reference to the figure
for (d).

(i) Find the number of variates between $x = 42$ and $x = 74$.

(j) Find the values of $x$ for which cum f = 250, 600, 750, respectively.




14. Given a normal distribution in which $N = 800$, $\bar{x} = 40$, $\sigma_x = 7$. Find the
numerical value of each of the following.

Qb Q 2 1 Qz t E, N f 

t /<=0 

15. Suppose $N = 5000$ variates are normally distributed such that $\bar{x} = 50$ and
$E = 13.49$. Without using the tables find the value of the following:
quartiles, median, mode, standard deviation, mean deviation, $x$ for which
cum f = 1250.



CHAPTER VII 
CURVE FITTING 

1. Empirical Expressions. The preceding chapters have dealt 
with the description and characterization of frequency distributions. 
We have considered three general methods of description: (1) graphical
devices, (2) the method involving calculation of averages and
measures of dispersion, (3) the method which is sometimes called 
analytical. This latter method consists in describing the distribution 
by an equation, and we considered only one such analytical expression, 

the normal curve. 

However, another branch of statistics is concerned with data which
may not be classed under frequency distributions, but which may be
described by simple equations.

When one variable is a function of another in applied mathematics
the mathematical relation between them is not always known. As we
mentioned in Chapter II, the only information regarding this
functional relationship may be a set of pairs of values obtained by
experimental or observational means. These pairs of values may be
regarded as coordinates of points and plotted. In doing so, the values
of the variable which is regarded as independent are taken as
abscissas, and those of the dependent variable as ordinates.

Example 1. Expectation of Life¹ at various ages.

  Age    Expectation
  20       42.20
  30       35.33
  40       28.18
  50       20.91
  60       14.10
  70        8.48
  80        4.39
  90        1.42

¹ By expectation of life at any age is meant the average number of years lived
by persons attaining that age, as given in the American Experience Mortality
Table.

The general problem in such cases is to find, if possible, an analytic
expression of the form $y = f(x)$ for the functional relationship
suggested by the data. Equations obtained to fit observed data as well
as possible are called empirical to distinguish them from the
rational expressions of pure mathematics which can be derived from
reasoning. This general problem is called curve fitting. It is also
sometimes referred to as “smoothing” the given data.

Example 2. Yearly Production of Cigarettes in the United States.

[Graph: yearly production plotted for 1923 to 1928; vertical scale marked from 60 to 100]

We will consider three types of functions: linear, quadratic,
and exponential.

2. Linear Functions. The simplest curve is a straight line, in which
case the form the function $f(x)$ takes is one which does not involve any
power of $x$ higher than the first. Such a form is therefore called
linear. We know from algebra that the general form of a
linear equation in two variables is

$Ax + By = C$

where $A$, $B$, and $C$ are arbitrary constants.

When $B \neq 0$, the equation may be solved for $y$, giving
$y = -(A/B)x + C/B$ which is of the form

(1) $y = mx + k$

and which is the form we will ordinarily use to represent a straight line.
The special cases where $A$ or $B$ or $C$ are zero are as follows:

When $A = 0$, then $y = C/B$, which is of the form $y = k$. This is
a line parallel to the x-axis. When $B = 0$, the equation takes the
form $x = k$ which is a line parallel to the y-axis. When $C = 0$, then
$Ax + By = 0$ which is a line passing through the origin.

A linear equation defines the functional relation when one variable changes at a constant rate with respect to another. Before proving this in the theorems below, we will define “rate of change.” By rate of change of y with respect to x we mean the change in y per unit change in x. Thus, if $(x_1, y_1)$ and $(x_2, y_2)$ are two pairs of values, the rate of change is given by the ratio

$$\frac{y_2 - y_1}{x_2 - x_1}.$$

Theorem I. If y = f(x) is linear, the rate of change of y with respect to x is constant.

Proof: By hypothesis, y = f(x) may be written y = mx + k where m and k are constants. If $(x_1, y_1)$ and $(x_2, y_2)$ are any two points on the line then their coordinates substituted for x and y in the equation of the line must satisfy it. Hence,

$$y_1 = mx_1 + k$$
$$y_2 = mx_2 + k$$

and subtracting, we have $y_2 - y_1 = m(x_2 - x_1)$ whence

$$m = \frac{y_2 - y_1}{x_2 - x_1}.$$

Since m is a constant the theorem is proved.

Theorem II. If y changes at a constant rate with respect to x, then y = f(x) is linear.

Proof: Denote the rate of change by m, and let $(x_1, y_1)$ be a given pair of values of the variables. Then taking any other pair of values (x, y) we have

$$m = \frac{y - y_1}{x - x_1},$$

whence $y - y_1 = m(x - x_1)$ which is linear.

The quantity

$$m = \frac{y_2 - y_1}{x_2 - x_1}$$

is called the slope of the line.

It is the tangent of the angle of inclination α (alpha). Lines having the same slope are parallel and conversely.

It is shown in Analytic Geometry that we may obtain the slope of a straight line from its equation if we solve for y and take the coefficient of x. Thus in 2x - y = 5, y = 2x - 5 and the slope is 2.



Conversely, if we know the slope of a line and the coordinates of any point on the line we can write its equation from the relation

(2) $y - y_1 = m(x - x_1)$

which is called the point-slope form of a straight line. Thus, given that (2, -1) is a point on a line whose slope is 2, the equation of the line is therefore y + 1 = 2(x - 2) or 2x - y = 5.

Or again, remembering that m is defined by a ratio involving the coordinates of two points on a line, we can obtain the equation of a line if we know any two points which lie on it. From the definition of m and (2), we have

(3) $y - y_1 = \dfrac{y_2 - y_1}{x_2 - x_1}(x - x_1)$

which is known as the two-point form of a straight line. Thus, given that (2, -1) and (6, 7) are two points on a line, its equation is

$$y + 1 = \frac{7 + 1}{6 - 2}(x - 2) \quad\text{or}\quad 2x - y = 5.$$
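The two-point form (3) can be checked mechanically. The following is only a minimal sketch: the function name `line_through` is hypothetical (it does not appear in the text), and exact fractions are used so that the result can be compared with the worked example without rounding.

```python
from fractions import Fraction

def line_through(p1, p2):
    """Return (m, k) of the line y = m*x + k through two points, using form (3).

    Hypothetical helper for illustration; integer coordinates are assumed.
    """
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: the slope is undefined")
    m = Fraction(y2 - y1, x2 - x1)  # slope, the ratio of Theorem I
    k = y1 - m * x1                 # from y - y1 = m(x - x1) evaluated at x = 0
    return m, k

# The worked example in the text: points (2, -1) and (6, 7)
m, k = line_through((2, -1), (6, 7))
print(m, k)  # slope 2 and intercept -5, i.e. y = 2x - 5 or 2x - y = 5
```

The vertical case is rejected explicitly because it corresponds to B = 0 in Ax + By = C, which cannot be written in the form (1).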

3. Quadratic Function. A quadratic function of a variable v is a polynomial of the second degree in v which may be expressed in the form $Av^2 + 2Bv + C$ where A, B, and C are fixed real numbers. The minimum value of such a function is useful in statistics. We have

$$Av^2 + 2Bv + C = \frac{1}{A}\left[A^2v^2 + 2ABv + AC\right] = \frac{1}{A}\left[(Av + B)^2 + (AC - B^2)\right].$$

Since $(Av + B)^2$ is positive or zero and $(AC - B^2)$ does not involve the variable, we have the following:

Theorem III. If A is positive the minimum value of $Av^2 + 2Bv + C$ occurs when Av + B = 0; the minimum value is $(AC - B^2)/A$.

The graph of the equation $y = Av^2 + 2Bv + C$, (A > 0), is a parabola which opens upward and whose vertex is where v = -B/A. Of course the function has its minimum value at this vertex, viz.: $(v_0, y_0)$ where $v_0 = -B/A$, $y_0 = (AC - B^2)/A$.
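As a quick numerical illustration of Theorem III, the sketch below evaluates the vertex formulas. The function name `quadratic_minimum` and the sample polynomial are illustrative assumptions, not taken from the text; note that the middle coefficient is written 2B, so B is half the coefficient of v.

```python
def quadratic_minimum(A, B, C):
    """Vertex (v0, y0) of A*v**2 + 2*B*v + C for A > 0, per Theorem III.

    Hypothetical helper for illustration only.
    """
    if A <= 0:
        raise ValueError("Theorem III requires A > 0")
    v0 = -B / A
    y0 = (A * C - B * B) / A
    return v0, y0

# Illustrative polynomial v**2 - 6v + 10: here A = 1, 2B = -6 so B = -3, C = 10
v0, y0 = quadratic_minimum(1, -3, 10)
print(v0, y0)  # 3.0 1.0

# Cross-check by direct evaluation at the vertex
assert v0 ** 2 - 6 * v0 + 10 == y0
```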





Exercises 

1. (Wilson and Tracy) The premium ($y) on a $1000 life insurance policy for various ages (x yrs.) is given in the following table. Draw a graph exhibiting y as a function of x. Estimate from the graph the premium at age 32 and at age 43; also the age at which the premium is $52.

x    20     25     30     35     40     45     50     55     60
y    18.78  21.02  23.86  27.54  32.36  38.83  47.68  59.88  76.94


2. Find an equation of each of the lines through two points given as follows:
(a) (2, 6), (4, 5); (b) (0, 3), (1, 6).

3. Find the equation of a line through the point (2, 3) and parallel to the line 4x + 5y = 7.

4. (a) Find the value of x for which $f(x) = 2x^2 - 8x + 9$ has a minimum value. (b) What is this minimum value? (c) Draw a graph of y = f(x) and show the meaning of your answers to (a) and (b).

5. How would the theorem in §3 be affected if A < 0?

6. Prove that the second moment of x is a minimum when taken about the 
mean of x. 

Hints. Solution 1.

Let

$$f(v) = \frac{1}{N}\sum_{i=1}^{N}(x_i - v)^2 = v^2 - 2\bar{x}v + \frac{1}{N}\sum_{i=1}^{N}x_i^2.$$

By the theorem of §3, show that f(v) is a minimum when $v = \bar{x}$.

Solution 2. By definition,

$$\mu_2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - v)^2 - (\bar{x} - v)^2.$$

Is $\mu_2 < \nu_2$?

Solution 3, for calculus students. From f(v) as defined above,

$$f'(v) = -\frac{2}{N}\sum_{i=1}^{N}(x_i - v).$$

Set f'(v) = 0 and solve for v. Since f''(v) > 0, $v = \bar{x}$ yields a minimum, not a maximum.

4. Fitting a Straight Line. The preceding discussion is intended as a basis for the presentation of certain methods of fitting a line to data. The equation y = mx + k represents a whole system or set of lines corresponding to different values of the arbitrary constants m and k. Such constants are called parameters. The process of finding the best fitting line for any given data consists in determining m and k. By “best fitting” we mean best under a criterion of approximation specified by a method. We will consider three such methods: (a) graphical, (b) the method of moments of ordinates, (c) the method of least squares.

5. Graphically. A straight line is drawn (preferably with the aid of a transparent ruler) to fit as closely as possible the plotted points. To find the equation of this line, select two points on the line and estimate their coordinates $(x_1, y_1)$ and $(x_2, y_2)$. Substituting these coordinates in the “two-point” form of the line (3), we get the desired equation.

If the first point is chosen so that $x_1 = 0$ the numerical work of simplifying the equation is somewhat lessened.


Example 3. Fit a line graphically to the data in Example 2.

We take the origin of x at 1923, hence from the figure ($x_1 = 0$, $y_1 = 67$) and ($x_2 = 5$, $y_2 = 100$).

By equation (3),

$$y - 67 = \frac{100 - 67}{5 - 0}(x - 0).$$

Therefore,

y = 6.6x + 67

is the required equation.

The graphical method is open to the objection that it depends upon the judgment of the investigator. Different people will locate the line in different positions and therefore obtain different equations. However, where only approximate results are needed it is usually quite satisfactory.

Data of Example 2 (x measured in years from 1923):

x           y
(1923) 0    66.7
       1    72.7
       2    82.3
       3    92.1
       4    93.0
(1928) 5    100.6

6. Method of Moments. In equation (1) y is not only a function of x but it is also a function of the parameters m and k. This functional relationship may be expressed symbolically by the notation f(x, m, k). Given the functional form of a curve y = f(x, a, b, c, ...) the parameters a, b, c, ..., may be determined by obtaining expressions for as many moments of the computed or functional y's as there are parameters in the function and equating these to the numerical moments of corresponding order of the observed or empirical y's. A solution of the resulting equations, theoretically possible, gives the “best” values of the parameters. This is the method of moments of ordinates. For a set of N values of $(x_i, y_i)$ the rth moment of y is defined by the expression

$$\frac{1}{N}\sum_{i=1}^{N} x_i^r y_i.$$

In fitting a straight line by this method we obtain two equations involving m and k if we equate the zeroth and first moments of the observed y's to the zeroth and first moments, respectively, of the y's computed from the assumed equation y = mx + k. All moments are taken about the origin of x. These two equations may then be solved for m and k. The procedure will be made clear by the figure and explanation below.



x      observed y ($_o y$)    computed y ($_c y$)
x_1    y_1                    mx_1 + k
x_2    y_2                    mx_2 + k
...    ...                    ...
x_i    y_i                    mx_i + k
...    ...                    ...
x_N    y_N                    mx_N + k


Suppose we are given N pairs of values of x and y. Denote the given or observed y's by $_o y$ and the computed y's by $_c y$. For the observed y's, the first moment is $\frac{1}{N}\sum x_i y_i$ and the zeroth moment is $\frac{1}{N}\sum x_i^0 y_i$. By a “computed y” corresponding to any value of x we mean the result obtained by substituting that value of x in the equation y = mx + k, and solving for y. Thus, for any value of x, say $x_i$, we obtain $mx_i + k$ for the corresponding computed y. Graphically, it is an ordinate of the line. Therefore, the first moment of the computed y's is $\frac{1}{N}\sum x_i(mx_i + k)$, and the zeroth moment is $\frac{1}{N}\sum x_i^0(mx_i + k)$.

Applying the principle of moments we have

                  observed                computed
zeroth moment     $\sum y_i$         =    $\sum (mx_i + k)$
first moment      $\sum x_i y_i$     =    $\sum x_i(mx_i + k)$

where the summations run from 1 to N.

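To solve for m and k we write the preceding equations in the following form:

(4)
$$m\sum x_i + kN = \sum y_i$$
$$m\sum x_i^2 + k\sum x_i = \sum x_i y_i.$$

By determinants,

(5)
$$m = \frac{\begin{vmatrix} \sum y & N \\ \sum xy & \sum x \end{vmatrix}}{\begin{vmatrix} \sum x & N \\ \sum x^2 & \sum x \end{vmatrix}} = \frac{(\sum y)(\sum x) - N\sum xy}{(\sum x)^2 - N\sum x^2}$$

$$k = \frac{\begin{vmatrix} \sum x & \sum y \\ \sum x^2 & \sum xy \end{vmatrix}}{D} = \frac{(\sum x)(\sum xy) - (\sum y)(\sum x^2)}{D}.$$

The determinant D in the expression for k is the same as that in the denominator of the expression for m. The terms in the expressions for m and k refer to the original data. When these expressions have been evaluated they replace m and k in the equation y = mx + k.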



Example 4. Find by the method of moments the best fitting line for the data in Example 2.

For these data N = 6, $\sum x = 15$, $\sum y = 507.4$, $\sum xy = 1388.6$, and $\sum x^2 = 55$.

Therefore,

$$m = \frac{(507.4)(15) - 6(1388.6)}{(225) - 6(55)} = 6.86$$

$$k = \frac{15(1388.6) - 55(507.4)}{D} = 67.4;$$

y = 6.86x + 67.4.
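The arithmetic of Example 4 can be reproduced directly from formulas (5). This is a minimal sketch; the function name `fit_line_moments` is a hypothetical label, and the data are the Example 2 values with x counted in years from 1923.

```python
def fit_line_moments(xs, ys):
    """Slope m and intercept k of y = m*x + k from formulas (5).

    Hypothetical helper; D is the common denominator (sum x)^2 - N * sum x^2.
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    d = sx * sx - n * sxx
    if d == 0:
        raise ValueError("all x values are equal: the line is not determined")
    m = (sy * sx - n * sxy) / d
    k = (sx * sxy - sy * sxx) / d
    return m, k

xs = [0, 1, 2, 3, 4, 5]                       # years after 1923
ys = [66.7, 72.7, 82.3, 92.1, 93.0, 100.6]    # Example 2 data
m, k = fit_line_moments(xs, ys)
print(round(m, 2), round(k, 1))  # 6.86 67.4
```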


7. An Alternative Procedure. In practice, it is sometimes easier to remember the procedure of fitting a line by the method of moments if one obtains the equations in (4) directly from the data instead of using the formulas for m and k. This will involve the following three steps:

(a) Substitute each of the given pairs of values in y = mx + k and add the resulting “equations.” This gives the first equation in (4).

(b) Multiply each “equation” in (a) by the coefficient of m in that equation and add the resulting “equations.” This gives the second equation in (4).

(c) Solve the equations simultaneously. This will give the required values of m and k.

Example. Verify, for the data in Example 2, that the above procedure gives the same values of m and k as the formulas.

Step (a)

   66.7 =  0m +  k
   72.7 =  1m +  k
   82.3 =  2m +  k
   92.1 =  3m +  k
   93.0 =  4m +  k
  100.6 =  5m +  k
  -----------------
  507.4 = 15m + 6k

Step (b) (the first equation, multiplied by 0, contributes nothing)

   72.7 =   m +   k
  164.6 =  4m +  2k
  276.3 =  9m +  3k
  372.0 = 16m +  4k
  503.0 = 25m +  5k
  ------------------
 1388.6 = 55m + 15k

Step (c)

Solving the equations, we obtain m = 6.86, k = 67.4, as before.
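The three steps can also be carried out programmatically. The sketch below is an illustration under stated assumptions: the names `normal_equations` and `solve_pair` are hypothetical, each equation is stored as a tuple (right-hand side, coefficient of m, coefficient of k), and step (c) is done by Cramer's rule rather than by hand elimination.

```python
def normal_equations(xs, ys):
    """Steps (a) and (b): build the two summed 'equations' as (rhs, coeff_m, coeff_k)."""
    # Step (a): add the equations y = x*m + 1*k
    eq_a = (sum(ys), sum(xs), len(xs))
    # Step (b): multiply each equation by its coefficient of m (namely x), then add
    eq_b = (sum(x * y for x, y in zip(xs, ys)),
            sum(x * x for x in xs),
            sum(xs))
    return eq_a, eq_b

def solve_pair(eq1, eq2):
    """Step (c): solve rhs = a*m + b*k for the two equations by Cramer's rule."""
    r1, a1, b1 = eq1
    r2, a2, b2 = eq2
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the equations are not independent")
    m = (r1 * b2 - r2 * b1) / det
    k = (a1 * r2 - a2 * r1) / det
    return m, k

xs = [0, 1, 2, 3, 4, 5]
ys = [66.7, 72.7, 82.3, 92.1, 93.0, 100.6]
eq_a, eq_b = normal_equations(xs, ys)   # about (507.4, 15, 6) and (1388.6, 55, 15)
m, k = solve_pair(eq_a, eq_b)
print(round(m, 2), round(k, 1))  # 6.86 67.4
```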



8. Least Squares. Case I. A standard method of fitting a curve to empirical data is one known as the method of least squares. Assume, as before, that the plotted data suggest the linear relationship y = mx + k. Let d represent the difference between the ordinate of any given point and the corresponding ordinate of the line, that is, $d_i = y_i - (mx_i + k)$. These differences are called residuals. A residual is found for every point, squared, and the results added. The method of least squares is based upon the principle that the most probable system of values of the constants is that which renders the sum of the squares of the residuals a minimum.

Hence m and k are chosen subject to the condition that $\sum d_i^2$ is to be a minimum. Now

$$\sum d^2 = \sum [y - (mx + k)]^2$$

(6) $$= Nk^2 + 2mk\sum x + m^2\sum x^2 - 2k\sum y - 2m\sum xy + \sum y^2.$$

This is a quadratic expression in k. We may write it in the form

(6a) $$f(k) = Nk^2 + 2k\left(m\sum x - \sum y\right) + C$$

where C represents the terms not involving k. Then according to Theorem III the minimum value of f(k) occurs when

$$k = \frac{\sum y - m\sum x}{N}$$

that is, when

$$Nk + m\sum x - \sum y = 0.$$

Equation (6) is also a quadratic expression in m. We must choose m so that

$$m\sum x^2 + k\sum x - \sum xy = 0.$$

These last two equations 1 are the same as (4). When obtained by the method of least squares they are called normal equations. Therefore the values of m and k in (5) determine the best fitting line both by the method of moments and of least squares. It can be shown that the two methods give the same result for any polynomial. 1

1 The student of calculus would obtain these equations as follows. Let $f(m, k) = \sum (y - mx - k)^2$. Then differentiate f(m, k) partially with respect to m and k, respectively, and equate the results to zero.

It is interesting to observe that the sum of the residuals is zero. Thus it can easily be shown that $\sum [y - (mx + k)] = 0$, when the values given in (5) are substituted for m and k. This property and the fact that the sum of the squares of the residuals is a minimum are quite analogous to two similar properties of the arithmetic mean, viz.,

(1) The sum of deviations from the mean is zero.

(2) The sum of the squares of deviations from the mean is less than the sum of the squares of such deviations taken from any other value, i.e., $\mu_2 < \nu_2$.
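Both properties of the least-squares line can be checked numerically on the Example 2 data. The sketch below is illustrative only: it computes m and k from deviations about the means (algebraically equivalent to (5)), and the helper name `sum_sq` and the perturbation sizes are arbitrary assumptions.

```python
xs = [0, 1, 2, 3, 4, 5]
ys = [66.7, 72.7, 82.3, 92.1, 93.0, 100.6]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept (deviation form, equivalent to formulas (5))
m = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
k = y_bar - m * x_bar

def sum_sq(m_, k_):
    """Sum of squared residuals for the trial line y = m_*x + k_."""
    return sum((y - (m_ * x + k_)) ** 2 for x, y in zip(xs, ys))

# Property 1: the residuals sum to zero (up to floating-point rounding)
residual_sum = sum(y - (m * x + k) for x, y in zip(xs, ys))

# Property 2: any nearby line has a larger sum of squared residuals
best = sum_sq(m, k)
nearby = [sum_sq(m + dm, k + dk) for dm in (-0.1, 0.1) for dk in (-0.5, 0.5)]

print(abs(residual_sum) < 1e-9, all(v > best for v in nearby))  # True True
```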

Case II. In Case I distances between the points and the line were taken parallel to the y-axis. But we may just as logically, from a formal point of view, take distances parallel to the x-axis, and make the x residuals the basis for a least-squares criterion of best fit.

[Figure: the x residual $d_i = x_i - (m_2 y_i + b)$, measured parallel to the x-axis.]

Similarly, for the method of moments: we can set up two equations such that the first moment of the observed x's equals the first moment of the computed x's, and the zeroth moment of the observed x's equals the zeroth moment of the computed. To do this let $x = m_2 y + b$ represent the equation of the line. Then by the principle of moments we have

$$\sum x = \sum (m_2 y + b)$$
$$\sum xy = \sum y(m_2 y + b).$$

Solving for $m_2$ and b we obtain

(7)
$$m_2 = \frac{(\sum x)(\sum y) - N\sum xy}{D}, \qquad b = \frac{(\sum y)(\sum xy) - (\sum x)(\sum y^2)}{D}, \qquad D = \left(\sum y\right)^2 - N\sum y^2.$$

1 See American Mathematical Monthly, September, 1923.

If we determined $m_2$ and b by making the sum of the squares of the x residuals a minimum we would get the results given in (7).

In general, Cases I and II will give different lines. Case I assumes that the observed points fail to fall on the line because of an error in the ordinates only. Case II assumes that only the x-coordinates are in error. In the application of curve fitting to economic data, etc., the formal mathematical procedure should not be used without first verifying that the underlying assumptions involved in the procedure are justified. Inasmuch as the independent variable x can be controlled in experimental and observational data, the errors usually exist only in the y's. Therefore in speaking of the best line by the method of moments or least squares it is conventional to mean the line which fits best in the sense of (5) rather than (7).

Case III (for calculus students). A third line can be obtained which fits best in the sense that the sum of the squares of the perpendicular distances from the points to the line is a minimum.

Let us suppose the equation of this line to be in the form

$$y' = mx' + k$$

where $x' = x - \bar{x}$, $y' = y - \bar{y}$, and $(\bar{x}, \bar{y})$ is the mean of the observed data. The distance $d_i$ from this line to a point $(x_i', y_i')$ representing a pair of observed values (referred to their respective means as origin) is, from analytics,

$$d_i = \frac{y_i' - mx_i' - k}{\sqrt{m^2 + 1}}.$$

We wish to make $\frac{1}{N}\sum_{i=1}^{N} d_i^2$ a minimum. Therefore we are to choose m and k so that the function

$$f(m, k) = \frac{1}{m^2 + 1}\left\{\frac{1}{N}\sum_{i=1}^{N}(y_i' - mx_i' - k)^2\right\}$$

is a minimum. This function may be written in the form

$$f(m, k) = \frac{1}{m^2 + 1}\left(\sigma_y^2 + k^2 + m^2\sigma_x^2 - 2mr\sigma_y\sigma_x\right)$$

where r is a convenient symbol defined by the relation

$$r\sigma_y\sigma_x = \frac{1}{N}\sum_{i=1}^{N} x_i' y_i'.$$

To make f(m, k) a minimum we first put $k^2 = 0$. Then we equate to zero the first derivative with respect to m and obtain

$$m^2 r\sigma_y\sigma_x - m(\sigma_y^2 - \sigma_x^2) - r\sigma_y\sigma_x = 0.$$

Solving for m we have

$$m = \frac{(\sigma_y^2 - \sigma_x^2) \pm \left[(\sigma_y^2 - \sigma_x^2)^2 + 4r^2\sigma_y^2\sigma_x^2\right]^{1/2}}{2r\sigma_y\sigma_x}.$$

Therefore the required equation is y' = mx'. Referred to the origin of x and y, this is

$$y - \bar{y} = m(x - \bar{x})$$

where m is determined above.

This line is the appropriate one to fit if there are errors in both x and y of the empirical data.
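The Case III formula gives two values of m, one for each sign of the radical. The sketch below is one way to apply it, under assumptions that go beyond the text: the name `perpendicular_fit` is hypothetical, and the choice between the two roots is made by evaluating f(m, 0) for each and keeping the smaller, a selection rule the text does not spell out. The test data are points lying exactly on y = 2x + 1, chosen so the expected slope is known in advance.

```python
import math

def perpendicular_fit(xs, ys):
    """Slope m of the Case III line through (x_bar, y_bar); hypothetical helper."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    var_x = sum((x - x_bar) ** 2 for x in xs) / n                       # sigma_x^2
    var_y = sum((y - y_bar) ** 2 for y in ys) / n                       # sigma_y^2
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n    # r*sigma_y*sigma_x
    if cov == 0:
        raise ValueError("r = 0: the formula for m is undefined")
    root = math.sqrt((var_y - var_x) ** 2 + 4 * cov ** 2)
    candidates = [((var_y - var_x) + sign * root) / (2 * cov) for sign in (1, -1)]

    def mean_sq_distance(m):
        # f(m, 0) from the text: mean squared perpendicular distance
        return (var_y + m * m * var_x - 2 * m * cov) / (m * m + 1)

    # Assumption: keep the root that gives the smaller mean squared distance
    m = min(candidates, key=mean_sq_distance)
    return m, x_bar, y_bar

# Points exactly on y = 2x + 1, so the fitted slope should be 2
m, x_bar, y_bar = perpendicular_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(m, x_bar, y_bar)  # 2.0 1.5 4.0
```

The two candidate slopes always have product -1, so they describe perpendicular lines through the mean point; one minimizes and the other maximizes the sum of squared perpendicular distances.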


Exercises

1. Fit a line to the following data by Case I:
Ans. y = -.5x + 8.

x    6   7   7   8   8   8   9   9   10
y    5   5   4   5   4   3   4   3   3

2. Show that $\sum d = 0$ for Exercise 1.

3. Using the values given in (5) for m and k show that $\sum [y - (mx + k)] = 0$.

4. Verify the expressions for $m_2$ and b given in (7). How would you modify the “alternate procedure” so it will apply to $m_2$ and b?

5. Fit a line to the data of Example 2 by the method of Case II.

9. Simplification. The formulas for m and k may be simplified. For certain purposes it may be desirable to make the transformations $x' = x - \bar{x}$ and $y' = y - \bar{y}$. This has the effect, graphically, of translating the origin to the point $(\bar{x}, \bar{y})$ so that the y-axis is moved to the value $\bar{x}$, and the x-axis is moved to the value $\bar{y}$. Let the equation of the line with reference to these new axes be $y' = m_1x' + k_1$.

The formulas for $m_1$ and $k_1$ will be the same as for m and k except that x will be replaced by x' and y by y'. Hence

$$m_1 = \frac{N\sum x'y' - \sum x'\sum y'}{N\sum x'^2 - \left(\sum x'\right)^2}$$

$$k_1 = \frac{\sum x'^2\sum y' - \sum x'\sum x'y'}{N\sum x'^2 - \left(\sum x'\right)^2}.$$

But since x' is a deviation from the mean of x, $\sum x' = 0$. Similarly, $\sum y' = 0$. Hence the values of $m_1$ and $k_1$ reduce to

(8)
$$m_1 = \frac{\sum x'y'}{\sum x'^2}, \qquad k_1 = 0.$$

Therefore the line goes through the new origin, and its equation is

(9) $y' = m_1x'$

where $m_1$ is defined in (8).

The above transformation may not lighten the computations unless the values of x or y are equispaced. However, it does simplify the theory in certain applications, particularly in correlation theory (Chapter VIII).

10. Time Series. If one of the variables is time, as in Examples 1 and 2, the data are called a time series. The best fitting line is then commonly called a trend line or trend. In the process of fitting a trend line, a first simplification, obviously, is to take the origin at one of the given dates as we did in Example 3. But a much greater simplification is possible, if the x's are equispaced, as they usually are in a time series. Denote the common difference of the x's by c and the mid-date by $\bar{x}$. Then we may shift the origin to $\bar{x}$ and change the unit of measurement along the horizontal axis to c. Thus we may let

(10) $$t = \frac{x - \bar{x}}{c}$$

where

(11) $$\bar{x} = \frac{x_1 + x_N}{2}$$

if the x's are equispaced.

Let us think now of our line in (t, y) coordinates, and let its equation be y = at + b. Our problem is to find a and b numerically from the given data, as we found m and k before. Our normal equations will be

$$\sum y = \sum (at + b)$$
$$\sum ty = \sum (at + b)t.$$

Since $\sum t = \frac{1}{c}\sum (x - \bar{x}) = 0$, and $\sum b = Nb$, the above equations are readily solved, giving

(12) $$a = \frac{\sum ty}{\sum t^2}, \qquad b = \frac{1}{N}\sum y.$$

The student should remember that this simplification can be used only when the x's are equispaced.


Example 5. Find the trend line for the following data. Here c = 5, and from (11) $\bar{x} = 10$.

x      y     t     ty     t^2
0      12    -2    -24    4
5      15    -1    -15    1
10     17     0      0    0
15     22     1     22    1
20     24     2     48    4
Sums   90           31    10

From (12),

$$a = \frac{31}{10} = 3.1, \qquad b = \frac{90}{5} = 18.$$

So the required equation is y = 3.1t + 18, with reference to the new origin and units. If we wish it in terms of x, we substitute

$$t = \frac{x - 10}{5}$$

and obtain y = .62x + 11.8.

Example 6. Same as Example 5, with another observation added. Note that when there is an even number of observations, the values of t are fractional. In this case it is convenient to use the column headings 2ty instead of ty, and $4t^2$ instead of $t^2$.

x      y      t       2ty    4t^2
0      12     -5/2    -60    25
5      15     -3/2    -45    9
10     17     -1/2    -17    1
15     22      1/2     22    1
20     24      3/2     72    9
25     30      5/2    150    25
Sums   120            122    70

$\bar{x} = 12.5$, $\sum ty = 61$, $\sum t^2 = 17.5$

a = 3.49, b = 20

y = 3.49t + 20

$$y = 3.49\left(\frac{x - 12.5}{5}\right) + 20$$

y = .7x + 11.28.
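Formulas (10) to (12) translate into a few lines of code. The following is a minimal sketch assuming equispaced x values listed in increasing order; the function names `fit_trend_equispaced` and `to_original_units` are hypothetical and chosen only for illustration. It reruns Examples 5 and 6.

```python
def fit_trend_equispaced(xs, ys):
    """Return (a, b, x_mid, c) for y = a*t + b with t = (x - x_mid)/c, per (10)-(12)."""
    n = len(xs)
    c = xs[1] - xs[0]                 # common difference of the x's
    x_mid = (xs[0] + xs[-1]) / 2      # formula (11)
    ts = [(x - x_mid) / c for x in xs]
    a = sum(t * y for t, y in zip(ts, ys)) / sum(t * t for t in ts)
    b = sum(ys) / n
    return a, b, x_mid, c

def to_original_units(a, b, x_mid, c):
    """Rewrite y = a*(x - x_mid)/c + b as y = slope*x + intercept."""
    slope = a / c
    intercept = b - a * x_mid / c
    return slope, intercept

# Example 5
a5, b5, mid5, c5 = fit_trend_equispaced([0, 5, 10, 15, 20], [12, 15, 17, 22, 24])
slope5, intercept5 = to_original_units(a5, b5, mid5, c5)
print(round(a5, 2), b5, round(slope5, 2), round(intercept5, 1))  # 3.1 18.0 0.62 11.8

# Example 6 (one more observation, so t takes half-integer values)
a6, b6, mid6, c6 = fit_trend_equispaced([0, 5, 10, 15, 20, 25], [12, 15, 17, 22, 24, 30])
print(round(a6, 2), b6)  # 3.49 20.0
```

Working with t directly, rather than with 2t as in the hand computation of Example 6, is harmless here because the machine is indifferent to fractional values.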

11. Exponential Trends. When the given y values form a geometric progression while the corresponding x values form an arithmetic progression, the relationship between the variables is given by an exponential function, and the best fitting curve is said to describe an exponential trend. Data from the fields of biology, banking, and economics frequently exhibit such a trend. Thus, the growth of bacteria is exponential. Money accumulating at compound interest follows the same kind of law of growth. And in business, sales or earnings may grow exponentially over a short period.

The characteristic property of this law is that the rate of growth, that is the rate of change of y with respect to x, at any value of x is proportional to the value of the function for that value of x. The function

(13) $$y = Ae^{Bx}$$

is a mathematical statement of this property. 1 The letter e is a fixed constant, whereas A and B are parameters to be determined from the data. If y decreases as x increases, B is negative. An interesting example of this case is the disappearance of radio-active substances like radium.

Fig. 28. General Appearance of the Graph of (13)

To assume that the apparent law of growth will continue is usually unwarranted, so only short range predictions can be made with any considerable degree of reliability. When the exponential character of the observed phenomenon ceases, a saturation point is said to be reached.

The parameters A and B. If we transform (13) so that it is linear with respect to its parameters we may use the methods for fitting a straight line to determine A and B. To this end we first take the logarithms of both sides of (13), obtaining

(14) log y = log A + (B log e)x

which is of the form

(15) Y = k + mx

where Y = log y, k = log A, m = B log e.

If we look up the logarithms of the given y's and denote them by Y, we may fit the equation Y = mx + k to the (x, Y) values by determining m and k by means of the formulas given in (5). In using these formulas we must remember to replace y by Y. After m and k are determined, A and B may be obtained from the relations

A = anti-log of k

B = m / log e, where log e = log 2.718 = .4343.

1 The student of calculus will understand that “rate of change” is used here in the derivative sense. For (13), dy/dx ∝ y.



The student may be interested to verify that the relation Y = mx + k can be put back into the form (13). We may write (14) in the form

$$y = 10^{\log A + (B\log e)x} = \left\{10^{\log A}\right\}\left\{10^{\log e}\right\}^{Bx} = Ae^{Bx}.$$

The last step follows because $10^{\log_{10} N} = N$ by definition of logarithm.

Example 7. Find the exponential trend for the following data, and draw the curve.

x      y        Y         xY         x^2
1      1.6      .2041     .2041      1
2      4.5      .6532     1.3064     4
3      13.8     1.1399    3.4197     9
4      40.2     1.6042    6.4168     16
5      125.0    2.0969    10.4845    25
Sums
15              5.6983    21.8315    55

From (5) we have,

$$D = \left(\sum x\right)^2 - N\sum x^2$$
$$m = \frac{1}{D}\left[\sum Y\sum x - N\sum xY\right]$$
$$k = \frac{1}{D}\left[\sum x\sum xY - \sum Y\sum x^2\right].$$

Therefore,

$$D = [(15)^2 - 5(55)] = -50$$
$$m = -\tfrac{1}{50}[(5.6983)(15) - 5(21.8315)] = .4737$$
$$k = -\tfrac{1}{50}[15(21.8315) - (5.6983)(55)] = -.2813 = 9.7187 - 10.$$

And log A = 9.7187 - 10, hence A = .5232. Also B = m / log e = .4737/.4343 = 1.091.

Therefore the required equation is

$$y = .5232e^{1.091x}.$$
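The logarithmic transformation of Example 7 can be repeated by machine. This is a sketch under stated assumptions: the function name `fit_exponential` is hypothetical, common (base 10) logarithms are used as in the text, and the logarithms are computed to full floating-point precision rather than read from a four-place table, so the last digit may differ slightly from the hand computation.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = A * e**(B*x) by fitting Y = m*x + k to Y = log10(y), using formulas (5)."""
    Ys = [math.log10(y) for y in ys]
    n = len(xs)
    sx = sum(xs)
    sY = sum(Ys)
    sxY = sum(x * Y for x, Y in zip(xs, Ys))
    sxx = sum(x * x for x in xs)
    d = sx * sx - n * sxx
    m = (sY * sx - n * sxY) / d
    k = (sx * sxY - sY * sxx) / d
    A = 10 ** k                      # A = anti-log of k
    B = m / math.log10(math.e)       # B = m / log e, with log e = .4343
    return A, B

A, B = fit_exponential([1, 2, 3, 4, 5], [1.6, 4.5, 13.8, 40.2, 125.0])
print(round(A, 3), round(B, 2))  # 0.523 1.09
```

As the footnote to this section observes, this minimizes squared residuals in log y, which is not quite the same as minimizing squared residuals in y itself.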
















When the x's are equispaced, as here, the work may be simplified by using (10) and fitting a line

Y = at + b.

The problem now is essentially the same 1 as in §10 where a and b are defined in (12) except that we are now dealing with (t, Y) coordinates instead of (x, y). The method is illustrated below.

t            Y         tY         t^2
-2           .2041     -.4082     4
-1           .6532     -.6532     1
 0           1.1399    0.0000     0
 1           1.6042    1.6042     1
 2           2.0969    4.1938     4
t = x - 3    5.6983    4.7366     10

From (12)

$$a = \frac{\sum tY}{\sum t^2} = \frac{4.7366}{10} = .4737$$
$$b = \frac{\sum Y}{N} = \frac{5.6983}{5} = 1.1397.$$

So Y = .4737t + 1.1397.

Transforming this into (x, Y) coordinates we have

Y = .4737(x - 3) + 1.1397
  = .4737x - .2814

as before.

For purposes of plotting, predicting, or interpolating, values of y in (13) may be obtained by means of the intermediate form (15). So to graph the above equation we assign values to x, as in the following table, compute the corresponding values of Y and then obtain the values of y from a table of logarithms. The curve in (x, y) coordinates is shown in Figure 29.

x    1         2         3         4         5         6
Y    0.1923    0.6660    1.1397    1.6134    2.0871    2.5608
y    1.56      4.63      13.79     41.06     122.2     363.8

1 The critical reader will realize that fitting a straight line to the values of log y is not quite the same as fitting an exponential to the values of y. However, the discrepancy usually does not affect the fit seriously. For a method which is free from this difficulty, see Glover's Tables, p. 468.


12. Ratio Charts. In the graphical representation of data that exhibit an exponential trend, it is often desirable to use semi-logarithmic paper. Such paper has a logarithmic scale in the vertical direction and a uniform scale in the horizontal direction. (Figure 30.) A logarithmic scale is one in which the distance from y = 1 to y = N equals log N. A “cycle” of rulings spaced according to the logarithms of the integers from 1 to 10 is the unit of the vertical log y scale.

“Semi-log” paper may be constructed or purchased having one or more cycles. The appropriate number of cycles is determined by the range of y values in the data to be plotted. If the bottom line of the first cycle is labeled 1 and taken as the origin of log y (log 1 = 0), the beginning of the next cycle is read 10 (log 10 = 1), the next one above that is read 100 (log 100 = 2), etc. However, the beginning of the first cycle may be labeled with any number which is an integral power (positive or negative) of 10, as .01, .1, 10, 100, etc. Corresponding lines in successive cycles are labeled with numbers which are 10 times those in the preceding cycle. Since y has no real logarithm if y ≤ 0, neither zero nor negative numbers are found on a logarithmic scale. Plotting a point whose semi-logarithmic coordinates are (x, y) is equivalent to plotting the point whose rectangular coordinates are (x, log y).

Example. Plot $y = 8(2^x)$ on semi-log paper.

Solution. Assigning values to x we form the following table,

x    -3   -2   -1   0   1    2    3    4
y     1    2    4   8   16   32   64   128

from which we obtain the semi-logarithmic graph shown in Figure 30.

We now state the following theorem.

Theorem IV. If A is a positive constant, the (x, log y)-graph of $y = Ae^{Bx}$ is a straight line.

Proof: Since (15) is linear in x and Y, its graph in (x, Y) rectangular coordinates is a straight line.




Semi-logarithmic graphs are also called ratio charts. Their usefulness depends upon the property of logarithms that

$$\log\frac{M}{N} = \log M - \log N.$$

It follows that the distance between any two ordinates of the chart measures the ratio between the values represented by these ordinates. Thus if

$$\frac{y_1}{y_2} = \frac{y_3}{y_4}$$

then

$$\log y_1 - \log y_2 = \log y_3 - \log y_4$$

or

$$Y_1 - Y_2 = Y_3 - Y_4,$$

that is, equal ratios are represented by equal vertical distances. Likewise, if

$$\frac{y_1}{y_2} > \frac{y_3}{y_4}$$

then

$$Y_1 - Y_2 > Y_3 - Y_4$$

and the larger ratio is represented graphically by the larger distance. These differences of elevation are independent of any base line. The same percentage increase in y is represented by the same addition to the height of Y in all parts of the chart. Hence, it is easier to depict and discover ratios of change on ratio charts than on ordinary charts.

The analysis of time series in economic statistics is often facilitated by forming “link relatives” which are ratios of each ordinate (after the first) to the preceding ordinate. Thus, if $y_1, y_2, \ldots, y_n$ are the given values, the link relatives are

$$R_1 = \frac{y_2}{y_1}, \quad R_2 = \frac{y_3}{y_2}, \quad \ldots, \quad R_{n-1} = \frac{y_n}{y_{n-1}}.$$

Any link relative R denotes the rate of change in y from one month (say) to the next. If the y's are plotted on ratio paper they will lie on a straight line when the R's are equal, on a curve bending upward when the R's are increasing, and on a curve bending downward when the R's are decreasing. It follows that if two curves are parallel on ratio paper their rate of increase (or decrease) is the same.
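Link relatives, and the corresponding equal vertical steps on a ratio chart, are easy to compute. The sketch below is illustrative only: the function name `link_relatives` is hypothetical, and the data are the values of y = 8(2^x) tabulated in the semi-log example of §12, for which every link relative should equal 2.

```python
import math

def link_relatives(ys):
    """Ratio of each ordinate (after the first) to the preceding ordinate."""
    return [later / earlier for earlier, later in zip(ys, ys[1:])]

# y = 8 * 2**x for x = -3, ..., 4, i.e. 1, 2, 4, ..., 128
ys = [8 * 2 ** x for x in range(-3, 5)]
relatives = link_relatives(ys)

# Vertical steps on a ratio chart are differences of logarithms
log_steps = [math.log10(later) - math.log10(earlier)
             for earlier, later in zip(ys, ys[1:])]

print(relatives)  # [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
print(all(abs(s - math.log10(2)) < 1e-12 for s in log_steps))  # True
```

Equal link relatives therefore show up as equal steps in log y, which is why such data fall on a straight line on ratio paper.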

For further discussion of ratio charts the student is referred to the 
books of Bivins and Haskell (see §7, Introduction). 

13. Further Remarks on the Exponential Function. Equation (13) is sometimes called the compound interest law because it describes the way money would grow if interest were compounded continuously. If P dollars are invested at a nominal rate j (expressed as a decimal) compounded m times a year, the amount S after x years is given by the formula

$$S = P\left(1 + \frac{j}{m}\right)^{mx}.$$

If j is compounded continuously or, in other words, if m is taken indefinitely large (written m → ∞), the amount S does not increase indefinitely but approaches a limiting value. We may write the expression for S in the form

$$S = P\left[\left(1 + \frac{j}{m}\right)^{m/j}\right]^{jx}.$$

If we let N = m/j, we have

$$S = P\left[\left(1 + \frac{1}{N}\right)^{N}\right]^{jx}.$$

It can be shown in the calculus 1 that, as N → ∞, the quantity $\left(1 + \frac{1}{N}\right)^{N}$ approaches the limit called e. Thus we have

$$\lim_{N \to \infty}\left(1 + \frac{1}{N}\right)^{N} = e = 2.718\ldots$$

1 The teacher can give appropriate references.

This limit is also the base of the Napierian, or natural, system of logarithms. As m → ∞ so does N → ∞. Therefore in the ideal case of continuous conversion of interest, we have the limiting form

$$S = \lim_{m \to \infty} P\left[\left(1 + \frac{j}{m}\right)^{m/j}\right]^{jx} = \lim_{N \to \infty} P\left[\left(1 + \frac{1}{N}\right)^{N}\right]^{jx}$$

that is

$$S = Pe^{jx}$$

which is of the form (13).
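The approach of the compound amount to its continuous limit can be observed numerically. The figures below are purely illustrative assumptions (P = 100, j = 0.06, x = 10 are not from the text), and the function names `amount` and `amount_continuous` are hypothetical.

```python
import math

def amount(P, j, m, x):
    """Amount after x years at nominal rate j compounded m times a year."""
    return P * (1 + j / m) ** (m * x)

def amount_continuous(P, j, x):
    """Limiting amount under continuous compounding, S = P * e**(j*x)."""
    return P * math.exp(j * x)

P, j, x = 100.0, 0.06, 10          # illustrative values only
limit = amount_continuous(P, j, x)
amounts = [amount(P, j, m, x) for m in (1, 2, 4, 12, 365)]

for m, s in zip((1, 2, 4, 12, 365), amounts):
    print(m, round(s, 2))
print("continuous", round(limit, 2))
```

The amounts increase with m but stay below the continuous value, in line with the statement that S does not increase indefinitely but approaches a limit.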

There are several other forms of the exponential function. For example, if we let $r = e^B$, (13) becomes

$$y = Ar^x$$

which is the general term of a geometric progression whose first term is A and common ratio is r.

If B is negative in $r = e^B$ then r < 1. So (13) is a decreasing function when B is negative.

If we let $10^k = e^B$, (13) becomes

$$y = A \cdot 10^{kx}.$$

Then $k = B\log_{10} e$ and k differs from B by the factor $\log_{10} e$. This factor is known as the modulus of the system of logarithms of base e with respect to the system of base 10.

14. Parabolic Trend. Data of broad economic or social significance extending over a long period of years may often be described by an arc of a second degree parabola. The equation of a parabola is of the form

$$y = \alpha + \beta x + \gamma x^2$$

where α, β, γ are the parameters to be determined. If the x's are equispaced we may let

$$t = \frac{x - \bar{x}}{c}$$

where $\bar{x} = (x_1 + x_N)/2$ and $c = |x_{i+1} - x_i|$, and thereby effect considerable simplification in evaluating the constants. In t and y coordinates the equation will, of course, involve different constants and we may write its equation in the form

(16) $$y = A + Bt + Ct^2.$$

The method of moments may again be used and since (16) is a polynomial this method also gives the best fitting curve in a least-squares sense. Because there are three constants to be determined we must equate the second moments as well as the zeroth and first moments. Imposing these conditions of moments between the observed and computed ordinates, we obtain the three normal equations:

$$\sum y = NA + B\sum t + C\sum t^2$$
$$\sum ty = A\sum t + B\sum t^2 + C\sum t^3$$
$$\sum t^2y = A\sum t^2 + B\sum t^3 + C\sum t^4.$$

Since the mean is chosen as origin, $\sum t = 0$. With this choice of origin and because the x's are equispaced it can be shown that $\sum t^3 = 0$. Therefore the normal equations simplify into

(17)
$$B = \frac{\sum ty}{\sum t^2}$$
$$NA + C\sum t^2 = \sum y$$
$$A\sum t^2 + C\sum t^4 = \sum t^2y.$$

When the summations involved in these equations are evaluated from the data the values of A, B, and C can easily be determined.

Example 9. Fit a parabola to the following data.

Number of Divorces per 1000 Marriages in the United States, 1900-1930

Year      y      x      t      ty    t^2    t^2 y    t^4
1900     81      0     -3    -243      9      729     81
1905     84      5     -2    -168      4      336     16
1910     88     10     -1     -88      1       88      1
1915    104     15      0       0      0        0      0
1920    134     20      1     134      1      134      1
1925    148     25      2     296      4      592     16
1930    170     30      3     510      9     1530     81
Sums    809  $\bar{x}$ = 15         441     28     3409    196



From (17),

$$B = \frac{441}{28}$$
$$7A + 28C = 809$$
$$28A + 196C = 3409.$$

Solving the last two equations simultaneously we obtain

$$A = \frac{322}{3}, \qquad C = \frac{173}{84}.$$

Therefore,

$$y = \frac{322}{3} + \frac{441}{28}t + \frac{173}{84}t^2.$$

If we desire the equation in the original form we substitute $t = \frac{1}{5}(x - 15)$ and
obtain

$$y = \frac{322}{3} + \frac{441}{28}\cdot\frac{x - 15}{5} + \frac{173}{84}\left(\frac{x - 15}{5}\right)^2$$

which simplifies into

$$y = 78.62 + .68x + .0824x^2.$$

Upon the hypothesis that divorces continue to increase during the next decade
according to this trend, we may estimate the number for 1940. When x = 40
in the above equation, we find y = 237.66.
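
The arithmetic of Example 9 can be retraced with exact fractions. This is a sketch only, not the author's computing form: the variable names and the helper `trend` are illustrative, and the 2-by-2 system in (17) is solved here by determinants.

```python
from fractions import Fraction

years = [1900, 1905, 1910, 1915, 1920, 1925, 1930]
y = [81, 84, 88, 104, 134, 148, 170]

x = [yr - 1900 for yr in years]          # x = years since 1900
N = len(x)
x_bar = Fraction(x[0] + x[-1], 2)        # midpoint of the equispaced x's
c = x[1] - x[0]                          # common spacing of the x's
t = [(xi - x_bar) / c for xi in x]       # coded variable; its sum is zero

sum_y = sum(y)
sum_ty = sum(ti * yi for ti, yi in zip(t, y))
sum_t2 = sum(ti ** 2 for ti in t)
sum_t2y = sum(ti ** 2 * yi for ti, yi in zip(t, y))
sum_t4 = sum(ti ** 4 for ti in t)

# Equations (17): B directly, then two equations in A and C.
B = sum_ty / sum_t2
det = N * sum_t4 - sum_t2 * sum_t2
A = (sum_y * sum_t4 - sum_t2 * sum_t2y) / det
C = (N * sum_t2y - sum_t2 * sum_y) / det

def trend(x_value):
    """Ordinate of the fitted parabola at a given x (hypothetical helper)."""
    tv = (Fraction(x_value) - x_bar) / c
    return A + B * tv + C * tv ** 2
```

Carried through without rounding, the estimate at x = 40 comes out slightly below the 237.66 obtained above from the rounded coefficients; the difference is due to rounding alone.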

15. The Gompertz Curve. The curve which bears his name was
suggested in 1825 by Gompertz for use in actuarial science. Recently
it has had some application as a growth curve in business and population
forecasting and in certain problems in education. Its equation¹
is

(18)    $$y = kg^{c^x}.$$

To determine the parameters, we first transform (18) into the logarithmic
form

(18a)    $$Y = K + Gc^x$$

where $Y = \log y$, $K = \log k$, $G = \log g$. The number, N, of observations
available must be such that N = 3n where n is the number in
each of three subgroups with no observations omitted; that is, N
must be divided into three blocks of data consisting of n items each.
It is also necessary that the values of the independent variable be
equispaced. Then the data can be put in the form shown below.
If the given values of x are substituted in (18a) we obtain the three
sets of functional Y's shown in (a), (b), and (c).

¹ For a derivation see Mathematical Theory of Life Insurance, by Forsyth.
John Wiley and Sons, Inc.


  x        Y              Functional value                      Group total

  0        $Y_0$          $Y_0 = K + Gc^0$
  1        $Y_1$          $Y_1 = K + Gc$
  ...      ...            ...                            (a)    $\sum_{i=0}^{n-1} Y_i$
  n-1      $Y_{n-1}$      $Y_{n-1} = K + Gc^{n-1}$

  n        $Y_n$          $Y_n = K + Gc^n$
  n+1      $Y_{n+1}$      $Y_{n+1} = K + Gc^{n+1}$
  ...      ...            ...                            (b)    $\sum_{i=n}^{2n-1} Y_i$
  2n-1     $Y_{2n-1}$     $Y_{2n-1} = K + Gc^{2n-1}$

  2n       $Y_{2n}$       $Y_{2n} = K + Gc^{2n}$
  2n+1     $Y_{2n+1}$     $Y_{2n+1} = K + Gc^{2n+1}$
  ...      ...            ...                            (c)    $\sum_{i=2n}^{3n-1} Y_i$
  3n-1     $Y_{3n-1}$     $Y_{3n-1} = K + Gc^{3n-1}$



Let $S_1$, $S_2$, $S_3$ denote respectively the totals of the subgroups (a),
(b), and (c). Thus we have

$$S_1 = nK + G(1 + c + \cdots + c^{n-1})$$
$$S_2 = nK + Gc^n(1 + c + \cdots + c^{n-1})$$
$$S_3 = nK + Gc^{2n}(1 + c + \cdots + c^{n-1}).$$

Then

$$S_2 - S_1 = G(c^n - 1)(1 + c + \cdots + c^{n-1})$$
$$S_3 - S_2 = Gc^n(c^n - 1)(1 + c + \cdots + c^{n-1})$$

whence we obtain

$$c^n = \frac{S_3 - S_2}{S_2 - S_1}.$$

Writing the expression for $S_2 - S_1$ in the form

$$S_2 - S_1 = G\frac{(c^n - 1)^2}{c - 1}$$

and solving for G, we obtain

$$G = \frac{(S_2 - S_1)(c - 1)}{(c^n - 1)^2}.$$











The expression for $S_1$ may be written

$$S_1 = nK + \frac{G(1 - c^n)}{1 - c},$$

so we have

$$K = \frac{1}{n}\left[S_1 - \frac{G(1 - c^n)}{1 - c}\right].$$

In the above expressions, $S_1$, $S_2$, $S_3$ denote sums of the functional
Y's. If these are now replaced by the empirical data so that

$$S_1 = \sum_{0}^{n-1} Y_o, \qquad S_2 = \sum_{n}^{2n-1} Y_o, \qquad S_3 = \sum_{2n}^{3n-1} Y_o$$

where $Y_o$ refers to the observed Y, then c can be determined
from the expression for $c^n$. Using the value of c, G can
be determined, and then K.

If c < 1, it is clear from (18a) that $Y \to K$ as $x \to \infty$.
Then y = k is an asymptote and k is sometimes called the
ceiling of the curve.

For an application of the above method to a problem
in business, see Statistical
Methods (Revised Edition) by Mills, page 672.
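
The three-group procedure above can be sketched in a few lines. The function name `fit_gompertz` and the data are hypothetical; the data are generated without error from assumed parameters, so the method should recover them. Logarithms are taken to base 10 here, which is an assumption, since the text writes only "log".

```python
def fit_gompertz(Y, n):
    """Three-group fit of Y = K + G*c**x for x = 0, 1, ..., 3n-1.

    Y holds the logarithms of the observations in order of x.
    Returns (K, G, c). Assumes (S3 - S2)/(S2 - S1) is positive.
    """
    if len(Y) != 3 * n:
        raise ValueError("need exactly 3n observations")
    S1 = sum(Y[:n])                  # total of group (a)
    S2 = sum(Y[n:2 * n])             # total of group (b)
    S3 = sum(Y[2 * n:])              # total of group (c)
    c_n = (S3 - S2) / (S2 - S1)      # this ratio is c**n
    c = c_n ** (1.0 / n)
    G = (S2 - S1) * (c - 1) / (c_n - 1) ** 2
    K = (S1 - G * (1 - c_n) / (1 - c)) / n
    return K, G, c

# Hypothetical, error-free data generated from assumed parameters.
K_true, G_true, c_true, n = 2.0, -1.5, 0.8, 3
Y = [K_true + G_true * c_true ** x for x in range(3 * n)]

K, G, c = fit_gompertz(Y, n)
k, g = 10 ** K, 10 ** G              # constants of (18), logs to base 10
```

With c less than 1 the fitted curve has the ceiling k, here 10 to the power K. With real data containing errors, the recovered values would of course only approximate the trend.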

16. Remarks and References. The methods of least squares and
moments do not select the appropriate curve. They merely determine
the “best” values of the parameters in the equation of the
curve which has been selected previously to describe the observed
data. The question of the type of curve which should be fitted to the
data is not always easy to answer. The selection of the appropriate
mathematical function depends to a large extent upon the investigator’s
experience in the field in which the problem lies and his knowledge
of the properties of curves. It always helps to plot the data first.
The usual requirements for practical purposes are that (a) the curve
must represent well the trend of the empirical data, and (b) the
mathematical expression must not involve too many parameters and
those present must be calculable from the data. In dealing with
time series, if the objective is to find out what would happen if the
percentage change should continue as it has on the average in the past,
then an exponential trend is indicated. If the objective is to find out
what would happen if the yearly (or monthly, etc.) change should
continue as it has in the past, a straight line trend is indicated.

For other elementary curves than those considered here and 
methods of fitting them we recommend 

1. An Introduction to Statistical Analysis — Richardson. 

2. The Principles of Financial and Statistical Mathematics — Philips. 

We will merely mention here two other important curves which 
require more advanced mathematics in their treatment. The logistic, 
or so-called Reed-Pearl curve, is used extensively in studying various 
growth phenomena. Its function is of the form 

1 

V = — r-rn 
a + be 

and it resembles somewhat the Gompertz curve discussed above. 
For further discussion of this curve and methods of fitting it see 

1. Elements of Statistics — Davis and Nelson. 

2. Statistical Methods , Revised — Mills. 

The function

$$y = ks^x g^{c^x}$$

is known as Makeham’s law. It is used in actuarial work. The student
having a working knowledge of the calculus will find an interesting
discussion of its use in the field of insurance in an article entitled
Makeham’s Laws of Mortality, Rietz, American Mathematical
Monthly, vol. 28, p. 471.

Exercises 

1. If the rate of change of y with respect to x is always proportional to the
   attained value of y then y is what kind of a function of x?

2. Determine A and B in the best fitting curve of the type (13) for the following
   data.

        Data              Form for Computations
      x       y         t       Y       tY      t^2
      0    1000
      5     100
     10      10
     15       1
     20      .1

3. (a) Prove formula (11).
   (b) Graph the curve $y = 10e^{-2x}$.

4. Find the best fitting parabola for the following points: (-4, 2), (0, 8), (4, 9),
   (8, 11), (12, 8), (16, 5).    Ans. $y = 7.2 + .94x - .07x^2$.

5. If the values of t form an arithmetic progression and $\sum t = 0$ prove that
   $\sum t^3 = 0$.

6. (a) Add the values x = 30, y = 37 to the data of Example 6 and find the
   trend line.    Ans. y = .8x + 10.42.
   (b) On the hypothesis that the apparent trend continues, predict the value
   of y when x = 35.

7. In a tensile test of a metal bar the following observations were made, where
   x represents the load in tons and y the elongation in ten-thousandths of
   an inch:

      x      1     2     3     4     5
      y     14    27    40    55    68

   Determine a linear relation between x and y by the theory of least squares.

8. In the following table y represents the fire losses in the United States in
   millions of dollars. Taking the origin of x at 1915 find the best fitting
   line, in a least-squares sense, for the data.

      x    1915   1917   1919   1921   1923   1925
      y     172    290    321    495    535    570


9. (a) Add the values x = 6, y = 300 to the data of Example 7 (p. 147) and find
   the equation of the best fitting exponential curve.

       Ans. $Y = .4617x - .2534$
            $y = .56e^{1.06x}$.

   (b) Plot the given data and the curve obtained in (a) on semi-log paper.

10. Distinguish between the forms of the curves represented by the functions
    $y = Ae^{-Bx}$ and $y = Ke^{-hx^2}$ where A, B, K, and h are positive real numbers.
    If these functions were plotted on semi-log paper what kind of
    curves would be obtained?

Note . Source material for additional exercises on curve fitting 
may be found in the current volumes of the following publications: 

1. Statistical Abstract of the United States. 

2. World Almanac and Book of Facts. 






CHAPTER VIII 
CORRELATION THEORY 

1. The Meaning of Simple Correlation. So far we have been 
concerned with the problems which arise from variation in a single 
variable. We will now consider the simultaneous variation of two 
variables. Methods for disclosing the facts of co-variation and for 
measuring the degree of relationship existing between two variables 
are due mainly to the English biometricians Sir Francis Galton 
(1822-1911) and Karl Pearson (1857-1936). 

Data presenting two sets of related measurements or observations
may arise in many fields of activity yielding N pairs of corresponding
variates $(x_i, y_i)$, i = 1, 2, 3, ..., N. Thus x may represent July rainfall
and y the average yield of corn in a certain section; x may be
an index of commodity prices and y an index of employment over
the same period; we may be interested in a group of school children
in which x is their height and y their weight, or x may refer to their
reading ability and y to their spelling ability; we may be studying the
chance distributions which are obtained in throwing two dice where
x is the number obtained in throws of a single die and y is the number
obtained in throws of the two dice together.

Example 1. In the following set of selected heights (inches), x = stature of
father, y = stature of son.

   x    69   70   69   68   70   73   69   67   69   64
   y    68   69   72   67   70   71   72   66   71   65

Example 2. (Snedecor.) The following data on twelve trees are adapted from
the results of an experiment to test the phenomenon that the injury by codling
moth larvae seems to be greatest on apple trees bearing a small crop. Here
x = hundreds of fruit on a tree, y = percentage of fruits wormy.

   x    15   15   12   26   18   12    8   38   26   19   29   22
   y    52   46   38   37   37   37   34   25   22   22   20   14



Fig. 31 


When the given pairs of values are represented by dots locating
the points whose rectangular coordinates are (x, y) we obtain a so-called
“scatter diagram” (Figure 31). The problem is to determine
the degree of association, or correlation as it is called, between the
x’s and the corresponding y’s since this indicates the significance of
the relationship.

The field of correlation may be thought of as bounded on the one
extreme by perfect functional dependence and on the other extreme
by complete independence in the probability sense. For example,
the pairs of values which satisfy the equation y = 2x - 5 do not present
a statistical problem. In this case the relationship is defined by a
mathematical function y = f(x). Similarly, at the other extreme we
would not be concerned with pairs of values which are completely
independent in the probability sense, as, for example, the
grades of students in statistics and the heights of their fathers. Two
variables are said to be statistically related when they lie between
these two extremes of relationship.

The theory of correlation is concerned with a twofold problem: 
first with measuring the indicated relationship, and secondly with 
predicting or estimating the average value of y associated with a 
designated value of x. 

2. The Coefficient of Correlation. It is fairly obvious from Figure
31 that with values of x in an assigned interval $\Delta x$ ($\Delta x$ small) the
corresponding values of y differ considerably. There is said to be
positive correlation if, for an assigned x larger than $\bar{x}$, the mean of the
corresponding y values is larger than $\bar{y}$, and, for values of x smaller
than $\bar{x}$, the mean of the corresponding values of y is less than $\bar{y}$.
On the other hand, as x increases the tendency may be for y to decrease.
In this case, for an assigned x larger than $\bar{x}$ the mean of the
corresponding y values is less than $\bar{y}$, and for an assigned x less than
$\bar{x}$ the mean of the corresponding y’s is greater than $\bar{y}$. There is then
said to be negative correlation. If, for an assigned x larger than $\bar{x}$
a corresponding y is no more likely to be above than below $\bar{y}$, the
variables are independent in the statistical or probability sense and
there is said to be zero correlation between them.

When the variables are correlated there is a tendency for the dots
in the scatter diagram to fall into a sort of band having a fairly definite
trend. We are assuming that this trend is linear, and a theory
built upon this assumption is known as simple or linear correlation.

In Figure 32 the origin of the x'y'-axes is taken at $(\bar{x}, \bar{y})$. Then
the points of the scatter diagram are distributed over the four quadrants
of the x'y'-plane.


Fig. 32 (scatter diagram referred to the x'- and y'-axes; quadrants I, II, III, IV)


The coordinates of the points in the four quadrants have algebraic
signs as follows. In quadrant

I, x' and y' are positive;

II, x' is negative and y' is positive;

III, x' and y' are negative;

IV, x' is positive and y' is negative.

Therefore, the product x'y' is positive for all dots which occur in
quadrants I and III and negative for all dots in quadrants II and IV.
The algebraic sum of all such products describes the distribution of
the dots over the quadrants. When this sum is positive the trend
of the dots is through quadrants III and I, when it is negative the
trend is through II and IV, and when zero there is no trend, the dots
being equally distributed over the four quadrants in the sense that
the positive products of x'y' balance the negative products. Consequently,
a natural measure of correlation would be obtained by
summing the products x'y' for all the observed values and taking the
average by dividing the result by N. Moreover, if we first express
x' and y' in units of their respective standard deviations we obtain a
measure of correlation which is independent of the original units.
This measure is universally denoted by r. Thus we have in symbols,

(1)    $$r = \frac{1}{N}\sum\left(\frac{x - \bar{x}}{\sigma_x}\right)\left(\frac{y - \bar{y}}{\sigma_y}\right).$$

It is variously called the total correlation, the product-moment coefficient
of correlation, and the correlation coefficient.

We may give the following word definition:

Definition. The correlation coefficient of two sets of variates expressed
in their respective standard deviations as units, is the arithmetic
mean of the products of deviations of corresponding values from
their respective means.

3. Other Formulas for r. Although formula (1) is very useful for
giving the meaning of the correlation coefficient, other formulas
easily obtained from (1) are usually much better adapted to numerical
computation. Since $\sigma_x$ and $\sigma_y$ are constants (1) may be written
as

(2)    $$r = \frac{\frac{1}{N}\sum(x - \bar{x})(y - \bar{y})}{\sigma_x\sigma_y}.$$

It is useful to think of this as

$$r = \frac{\text{covariance}}{[(\text{variance of } x)(\text{variance of } y)]^{1/2}}.$$

Formula (2) reduces to

(3)    $$r = \frac{\frac{1}{N}\sum xy - \bar{x}\bar{y}}{\left[\frac{1}{N}\sum x^2 - \bar{x}^2\right]^{1/2}\left[\frac{1}{N}\sum y^2 - \bar{y}^2\right]^{1/2}}.$$

It will be proved later that r cannot be larger than 1 nor less than
-1.
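
As a check that the definitional and computational formulas agree, the following sketch evaluates r for the father and son statures of Example 1 in two ways. The variable names are illustrative; the standard deviations use the divisor N, as in the text.

```python
import math

x = [69, 70, 69, 68, 70, 73, 69, 67, 69, 64]   # stature of father
y = [68, 69, 72, 67, 70, 71, 72, 66, 71, 65]   # stature of son
N = len(x)

mean_x = sum(x) / N
mean_y = sum(y) / N
# Standard deviations with divisor N.
sd_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / N)
sd_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / N)

# Formula (1): mean product of the deviations in standard units.
r1 = sum(((xi - mean_x) / sd_x) * ((yi - mean_y) / sd_y)
         for xi, yi in zip(x, y)) / N

# Formula (3): raw sums of x, y, xy, x^2 and y^2 only.
mean_xy = sum(xi * yi for xi, yi in zip(x, y)) / N
mean_x2 = sum(xi * xi for xi in x) / N
mean_y2 = sum(yi * yi for yi in y) / N
r3 = (mean_xy - mean_x * mean_y) / (
    math.sqrt(mean_x2 - mean_x ** 2) * math.sqrt(mean_y2 - mean_y ** 2))
```

The two values agree except for rounding in the last decimal places. Formula (3) needs no deviations from the mean, which is why it suits hand or machine computation.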




Theorem I. The value of r is independent of the origin of reference
and the units of measurement.

Proof: Let

$$u = \frac{x - x_0}{h}, \qquad v = \frac{y - y_0}{k}.$$

Then

$$x = uh + x_0, \quad y = vk + y_0, \quad \sigma_x = h\sigma_u, \quad \sigma_y = k\sigma_v.$$

Substituting in (2) we obtain

(4)    $$r = \frac{\frac{1}{N}\sum(u - \bar{u})(v - \bar{v})}{\sigma_u\sigma_v}$$

(4a)    $$r = \frac{\frac{1}{N}\sum uv - \bar{u}\bar{v}}{\sigma_u\sigma_v}$$

where

$$\sigma_u = \left[\frac{1}{N}\sum u^2 - \bar{u}^2\right]^{1/2}, \qquad \sigma_v = \left[\frac{1}{N}\sum v^2 - \bar{v}^2\right]^{1/2}.$$

Since (4) and (4a) are independent of the constants $x_0$, $y_0$, h, and k,
the theorem is proved.
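
Theorem I can be illustrated numerically. In this sketch the constants x0, y0, h, and k are hypothetical, and h and k are taken positive, as the relations between the standard deviations in the proof assume; the helper `corr` is not from the text.

```python
import math

def corr(a, b):
    """Product-moment correlation by formula (2), divisor N."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / n
    va = sum((ai - ma) ** 2 for ai in a) / n
    vb = sum((bi - mb) ** 2 for bi in b) / n
    return cov / math.sqrt(va * vb)

x = [69, 70, 69, 68, 70, 73, 69, 67, 69, 64]   # data of Example 1
y = [68, 69, 72, 67, 70, 71, 72, 66, 71, 65]

# Hypothetical change of origin (x0, y0) and of units (h, k > 0).
x0, y0, h, k = 68.0, 69.0, 2.54, 0.5
u = [(xi - x0) / h for xi in x]
v = [(yi - y0) / k for yi in y]

r_xy = corr(x, y)   # r in the original units
r_uv = corr(u, v)   # r after the transformation
```

The two values coincide up to rounding, whatever positive h and k and whatever origin are chosen.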

When the given values of x and y are large and a computing machine
is not available, the computations may be lightened by an appropriate
choice of these constants. If only the origin of reference is changed,
then h = k = 1, and $u = x - x_0$, $v = y - y_0$. If the means are taken
as the origin of reference by letting $x' = x - \bar{x}$ and $y' = y - \bar{y}$,
then $\bar{x}' = \bar{y}' = 0$ and the formula becomes,

(5)    $$r = \frac{\frac{1}{N}\sum x'y'}{\sigma_x\sigma_y}.$$

A subscript notation should be attached to r when there are several
series of variates. Thus, $r_{xy}$ for the (x, y) series, $r_{xz}$ for the (x, z)
series, $r_{12}$ for the series denoted by $(x_1, x_2)$, etc.

Example 3. To illustrate the formulas we will compute the value of r for the
following data. Here x = Brokers' Loans in billions of dollars and y = The
Annalist's index of the prices of fifty rail and industrial stocks in 1929. We choose
u = x - 5.00 and v = y - 250.



Month         x      y       u      v       uv       u^2       v^2
J           5.33    248     .33     -2    -0.66     .1089        4
F           5.67    248     .67     -2    -1.34     .4489        4
M           5.65    243     .65     -7    -4.55     .4225       49
A           5.56    249     .56     -1     -.56     .3136        1
M           5.53    235     .53    -15    -7.95     .2809      225
J           5.28    265     .28     15     4.20     .0784      225
J           5.77    282     .77     32    24.64     .5929     1024
A           6.02    303    1.02     53    54.06    1.0404     2809
S           6.35    290    1.35     40    54.00    1.8225     1600
O           6.80    230    1.80    -20   -36.00    3.2400      400
N           4.88    201    -.12    -49     5.88     .0144     2401
D           3.45    206   -1.55    -44    68.20    2.4025     1936
Sums                       6.29      0   159.92   10.7659    10678
(1/N) Sums                .5242      0  13.3267     .8972  889.8333

Computations: $\sigma_u = [.8972 - (.5242)^2]^{1/2} = .79$

              $\sigma_v = [889.8333]^{1/2} = 29.83.$

From (4a) we have,

$$r = \frac{13.3267}{(29.83)(.79)} = .57.$$
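
The computation of Example 3 can be repeated by machine. This is a sketch of formula (4a) with the same choice u = x - 5.00 and v = y - 250; the variable names are illustrative.

```python
import math

# Brokers' loans (x) and stock price index (y), January to December 1929.
x = [5.33, 5.67, 5.65, 5.56, 5.53, 5.28, 5.77, 6.02, 6.35, 6.80, 4.88, 3.45]
y = [248, 248, 243, 249, 235, 265, 282, 303, 290, 230, 201, 206]
N = len(x)

u = [xi - 5.00 for xi in x]      # change of origin only, so h = k = 1
v = [yi - 250 for yi in y]

mean_u = sum(u) / N
mean_v = sum(v) / N
mean_uv = sum(ui * vi for ui, vi in zip(u, v)) / N
sd_u = math.sqrt(sum(ui * ui for ui in u) / N - mean_u ** 2)
sd_v = math.sqrt(sum(vi * vi for vi in v) / N - mean_v ** 2)

r = (mean_uv - mean_u * mean_v) / (sd_u * sd_v)   # formula (4a)
```

Because the mean of v is zero for this choice of origin, the product of the means drops out of the numerator, as in the hand computation.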


Exercises

1. When x' and y' represent deviations from the means,
   (a) Show from (1) that $\sum x'y' = Nr\sigma_x\sigma_y$.
   (b) Show that $N\sigma_x^2 = \sum x'^2$.

2. Derive formula (3) from (2).

3. Show that (3) may be written as

$$r = \frac{N\sum xy - \sum x\sum y}{\left[\{N\sum x^2 - (\sum x)^2\}\{N\sum y^2 - (\sum y)^2\}\right]^{1/2}}.$$

4. Find r for the data of Example 1.

5. Find r for the data of Example 2.

6. The following data represent the ages of husband (x) and wife (y) of twenty
   couples. Find r using (5).    Ans. 0.856.


   x    22 24 26 26 27 27 28 28 29 30 30 30 31 32 33 34 35 35 36 37

y 








24 








g 



g 

32 















7. In studying a set of pairs of related variates, a statistician has completed the
   preliminary arithmetic and obtained the following results:

   N = 100; $\sum x^2$ = 1,585,000; $\sum x$ = 12,500; $\sum xy$ = 1,007,425; $\sum y^2$ =
   648,100; $\sum y$ = 8,000. Find $\bar{x}$, $\bar{y}$, $\sigma_x$, $\sigma_y$, r.

8. The table in Exercise 2, page 92, contains the grades made on two tests by
   twenty-five students in mathematics. Find r for this data.    Ans. 0.786.

9. Suggest examples of negative correlation.

10. In the following anthropometric measurements on a random sample of
    twenty male freshmen, taken from the Physical Education Department,
    x represents height, y represents chest measurement, both measurements
    being taken to the nearest tenth of an inch, and z represents weight to the
    nearest pound. Find the coefficient of correlation (a) between x and y,
    (b) between x and z, (c) between y and z.


      x       y       z          x       y       z
    68.5    33.6    148        65.3    33.0    136
    67.2    35.0    144        65.1    34.0    144
    67.7    30.2    145        64.8    37.3    170
    63.8    30.0    108        69.6    33.4    154
    69.9    33.0    130        68.2    31.5    122
    64.7    31.0    112        68.8    32.0    141
    68.4    33.0    134        72.3    35.0    159
    66.4    30.2    112        67.8    33.7    134
    69.1    33.3    143        71.3    31.5    136
    71.0    32.3    136        63.5    33.6    126


4. Regression. The properties of r can be studied by fitting a
line to the scatter diagram in such a way as to make the sum of the
squares of the vertical distances from the points to the line a minimum.

When such a line is referred to the point $(\bar{x}, \bar{y})$ as origin, we have
seen (§9, Chapter VII) that its equation is $y' = m_1 x'$ where

$$m_1 = \frac{\sum x'y'}{\sum x'^2}$$

and $x' = x - \bar{x}$, $y' = y - \bar{y}$. This value of $m_1$ may easily be expressed
in terms of r and the standard deviations, as follows:

$$m_1 = \frac{Nr\sigma_y\sigma_x}{N\sigma_x^2} = r\frac{\sigma_y}{\sigma_x}.$$

Therefore, the equation of our line, referred to a system of axes whose
origin is at the means of the variates, is

(6)    $$y' = r\frac{\sigma_y}{\sigma_x}x'.$$




This is called the regression line of y on x. The term regression
was used first by Galton in studying inheritance of stature. He
found that offspring of abnormally tall or short parents tend to
“step back” or “regress” to the ordinary population height.
However, as now used, regression line has no reference to biometry,
but is merely a convenient term.

By fitting a line $x' = m_2 y'$ to the points of the scatter diagram in
such a way that the sum of the squares of the horizontal distances
from the points to the line shall be a minimum, it is possible to deduce
a second regression line (the regression line of x on y) whose
equation, referred to $(\bar{x}, \bar{y})$, is

(7)    $$x' = r\frac{\sigma_x}{\sigma_y}y'.$$

Note that (7) cannot be obtained by solving for x' in (6). The
two regression lines will coincide if, and only if, r = ±1. From
the equations of the regression lines it is evident that if r > 0, an
increase in the one variable tends to accompany an increase in the
other; if r < 0, an increase in the one will be accompanied by a
decrease in the other.

Equations (6) and (7) may be expressed, if desired, in terms of the
original variables x and y instead of the deviations x' and y'. It is
obvious that they may be written as

(8)    $$y - \bar{y} = r\frac{\sigma_y}{\sigma_x}(x - \bar{x})$$

and

(9)    $$x - \bar{x} = r\frac{\sigma_x}{\sigma_y}(y - \bar{y})$$

when referred to the origin of x and y.

Equation (8) may be used to estimate values of y corresponding
to designated values of x. Similarly, from equation (9) we may
estimate x for designated values of y. It would be appropriate to
use (8) as a predicting equation when the variation in y is caused or
controlled by variation in x; (9) would be used when the variation
in x is caused or controlled by the variation in y.
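
The use of (8) and (9) as estimating equations may be sketched as follows. The summary statistics are assumed values chosen for illustration (x a weight in pounds, y a height in inches), and the function names are hypothetical.

```python
def estimate_y_from_x(x, mean_x, mean_y, sd_x, sd_y, r):
    """Equation (8): estimate y for a designated x."""
    return mean_y + r * (sd_y / sd_x) * (x - mean_x)

def estimate_x_from_y(y, mean_x, mean_y, sd_x, sd_y, r):
    """Equation (9): estimate x for a designated y."""
    return mean_x + r * (sd_x / sd_y) * (y - mean_y)

# Assumed summary statistics: x = weight (lbs.), y = height (in.).
stats = dict(mean_x=150.0, mean_y=68.0, sd_x=20.0, sd_y=2.5, r=0.60)

height_at_200_lbs = estimate_y_from_x(200.0, **stats)   # uses (8)
weight_at_60_in = estimate_x_from_y(60.0, **stats)      # uses (9)

# Solving (8) for x instead of using (9) gives a different line,
# and hence a different estimate, unless r = +1 or -1.
slope_8 = stats["r"] * stats["sd_y"] / stats["sd_x"]
x_from_inverting_8 = stats["mean_x"] + (60.0 - stats["mean_y"]) / slope_8
```

The last value differs widely from the estimate given by (9), which illustrates why (7) cannot be obtained by solving (6) for x'.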

The quantity $m_1 = r(\sigma_y/\sigma_x)$ is called the regression coefficient of y
on x, being the variation in y corresponding to a unit change in x.
Likewise, $m_2 = r(\sigma_x/\sigma_y)$ is called the regression coefficient of x on y.
Thus the numerical value of r is given by $(m_1 m_2)^{1/2}$ but its sign must
be determined otherwise. The following quotation from Snedecor¹
sheds light on the distinction between regression and correlation.

The point of interest here is that r is the geometric mean of the two regression
coefficients. In ordinary units of measurement, therefore, r is an average of the
two regression coefficients used in (i) estimating y from x and (ii) estimating
x from y. This serves to clarify the relation of the two coefficients, correlation
and regression, in measuring relationship. The latter is the appropriate one if
one variable, y, may be designated as dependent on the other, x. Values of y
may be partly controlled or caused by x, as when the available amounts of some
glandular secretion cause differences in the sizes of organisms. Or, y may be
subsequent to x, as weight gain in nutrition experiments follows the measurement
of initial weight. In such cases, the regression of y on x is usually the statistic
that furnishes the information desired. It is then appropriate to attempt to
estimate the value of y from a knowledge of the corresponding value of x. Correlation,
on the other hand, is the appropriate measure of the relation between
two variates like statures of husband and wife. The two heights are known to
be associated through some complex of social and biological causes, but neither
may be looked upon as a consequence of the other. In this sense correlation
is a two-way average of relationship, while regression is directional. Of course,
there are many variables whose relationship may be studied by means of either
correlation or regression, or both. It is necessary only to keep clearly in mind
the character of the relation being considered.

¹ Statistical Methods, Collegiate Press, Ames, Iowa.

Geometrically, $m_1$ is the slope of line (8) and $1/m_2$ is the slope of
line (9). The two lines intersect at $(\bar{x}, \bar{y})$.

Exercises

1. Derive the equation of the line of regression of x on y as suggested above.

2. Find the equations of both lines of regression for Exercise 6 (page 164), and
   plot them.    Ans. y = .888x - .64
                      x = .825y + 8.55.

3. Using the appropriate equation, find the estimated values of y corresponding
   to the given values of x, for Exercise 6 (page 164).

4. Given the following results for the heights and weights of 1000 men students:

   $\bar{y}$ = 68.00 in., $\bar{x}$ = 150.00 lbs., r = .60,
   $\sigma_y$ = 2.50 in., $\sigma_x$ = 20.00 lbs.

   John Doe weighs 200 lbs., Richard Roe is five feet tall.
   Estimate the height of Doe from his weight, and the weight of Roe from
   his height.

   Ans. Doe’s height = 71.75 in.
        Roe’s weight = 111.6 lbs.

5. (a) Given the following:

   $\sum x$ = 150,000, $\sum x^2$ = 22,725,000, $\sum xy$ = 10,522,500,
   $\sum y$ = 70,000, $\sum y^2$ = 4,936,000, N = 1000.

   Find $\bar{x}$, $\bar{y}$, $\sigma_x$, $\sigma_y$, r, and the lines of regression.

   (b) Suppose the data in (a) refer to the weight in pounds (x) and the height
   in inches (y) of a sample of 1000 policemen. Suppose Paul Private weighs
   160 pounds and Saul Sergeant is 6 feet tall. Estimate the height of
   Private and the weight of Sergeant.


5. The Standard Error of Estimate. The average concentration
of the points around the regression line of y on x may be measured
by the expression $\frac{1}{N}\sum d^2$ where d is the difference between an observed
y and the y obtained from the regression line. The value of
$\frac{1}{N}\sum d^2$ will be denoted by $S_y^2$, and $S_y$ is called the standard deviation
of the errors of estimate, or more briefly the standard error of estimate.
The errors of estimate are the deviations of the observed values of y
from the corresponding estimated y’s. Or to describe them another
way, they are the deviations of the sample y’s from the assumed
population y’s. It can be shown that $S_y^2 = \sigma_y^2(1 - r^2)$. To prove
this we may write the sum of the squares of the deviations in the
form:

$$NS_y^2 = \sum\left(y' - r\frac{\sigma_y}{\sigma_x}x'\right)^2 = \sum y'^2 - 2r\frac{\sigma_y}{\sigma_x}\sum x'y' + r^2\frac{\sigma_y^2}{\sigma_x^2}\sum x'^2$$

$$= N\sigma_y^2 - 2Nr^2\sigma_y^2 + Nr^2\sigma_y^2 = N\sigma_y^2(1 - r^2).$$

Hence, we have

(10)    $$S_y^2 = \sigma_y^2(1 - r^2)$$

and

(10a)    $$S_y = \sigma_y(1 - r^2)^{1/2}.$$

An analogous consideration of the differences between the x’s and
the regression line of x on y, gives for the square of the standard
error of estimate of the x’s

(11)    $$S_x^2 = \sigma_x^2(1 - r^2).$$
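
Relation (10) can be verified directly on a small set of data. The sketch below uses the father and son statures of Example 1 of this chapter; the variable names are illustrative, and all variances use the divisor N.

```python
import math

x = [69, 70, 69, 68, 70, 73, 69, 67, 69, 64]
y = [68, 69, 72, 67, 70, 71, 72, 66, 71, 65]
N = len(x)

mean_x, mean_y = sum(x) / N, sum(y) / N
var_x = sum((xi - mean_x) ** 2 for xi in x) / N
var_y = sum((yi - mean_y) ** 2 for yi in y) / N
cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / N
r = cov / math.sqrt(var_x * var_y)

# Slope of the regression line of y on x; equal to r * sd_y / sd_x.
slope = cov / var_x

# Errors of estimate d: observed y minus the y given by the line.
d = [yi - (mean_y + slope * (xi - mean_x)) for xi, yi in zip(x, y)]
S_y_squared = sum(di ** 2 for di in d) / N    # left member of (10)

right_member = var_y * (1 - r ** 2)           # right member of (10)
```

The mean square of the errors of estimate, computed point by point, agrees with the right member of (10) up to rounding.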

6. Properties of the Correlation Coefficient and Standard Error
of Estimate. Certain properties of r may now be deduced. It is
obvious from (10) that $|r| \le 1$ because both the left member and
$\sigma_y^2$ are positive or zero. Therefore,

$$-1 \le r \le 1.$$

If the points all lie exactly on the regression line, the left member of
(10) vanishes and r = ±1. There is then said to be perfect linear
correlation, since the relation between x and y is given exactly by a
linear function. A large numerical value of r means that the regression
lines are close to coincidence and the points in a scatter diagram
cluster closely around the regression lines.

When the regression lines (8) and (9) are expressed in standard
units, they become respectively

(12)    $$t_y = rt_x$$

and

(13)    $$t_x = rt_y \quad \text{or} \quad t_y = \frac{1}{r}t_x$$

where

$$t_x = \frac{x - \bar{x}}{\sigma_x} \quad \text{and} \quad t_y = \frac{y - \bar{y}}{\sigma_y}.$$

In this form we see at once that as one variable $t_x$ increases, the other
variable $t_y$ increases (or decreases) to an extent that depends upon r.
Thus r measures co-variation in the variables when they are expressed
in comparable units and when regression is linear.

In standard units, r is the slope of line (12) and 1/r is the slope of
line (13). When r = 0, the regression equations become $t_y = 0$ and
$t_x = 0$ in standard units or $y = \bar{y}$ and $x = \bar{x}$ in the original units.
These are also the equations of the coordinate axes. Therefore,
when r = 0 the regression lines are perpendicular to each other and
coincide with the $t_x$ and $t_y$ axes. When r = 1 the regression equations
become identical and the two lines coincide in quadrants I and III.
Similarly, when r = -1 they coincide in quadrants II and IV.
In each case the coincident lines bisect the quadrants if the equations
are expressed in standard units, but not otherwise unless $\sigma_y = \sigma_x$. The
angle $\theta$ between the regression lines varies from 0° to 90° as r varies
from one to zero.






When there is no correlation between x and y then r = 0, and the
variables are said to be independent in the statistical sense. On the
other hand, when r = 0, it is not necessarily true that the variables
are statistically independent. Indeed there may be a high correlation¹
with non-linear regression when r = 0. (Non-linear regression
will be considered in §21.) Incidentally, the phrase “independent
variables” in the statistical sense should not be confused with the
phrase “independent variables” which is used in the ordinary sense
of analysis to designate the variables on which a specified function
depends. However, the two usages, though quite distinct, are not
fundamentally contradictory, since functional dependence can be
regarded as a limiting case of statistical dependence.
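
That r = 0 does not imply independence can be seen from a simple constructed case. The data below are hypothetical: y is an exact function of x, yet the product-moment coefficient vanishes because the x's are symmetric about their mean. The helper `corr` is not from the text.

```python
import math

def corr(a, b):
    """Product-moment correlation coefficient, divisor N."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / n
    va = sum((ai - ma) ** 2 for ai in a) / n
    vb = sum((bi - mb) ** 2 for bi in b) / n
    return cov / math.sqrt(va * vb)

x = [-3, -2, -1, 0, 1, 2, 3]     # hypothetical values, symmetric about 0
y = [xi ** 2 for xi in x]        # y depends exactly on x, but not linearly

r = corr(x, y)                   # positive and negative products cancel
```

Here the regression of y on x is parabolic, and r, which measures only linear relationship, fails to detect it.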

For an appreciation of the use of $S_y$ in passing judgment upon the
precision to be expected in estimating values of y by means of the
regression equation of y on x, it is instructive to consider
simultaneously the meanings of (8) and (10a) as $|r|$ varies from 0 to 1.
When $r = 0$, (8) becomes $y = \bar{y}$, which means that the best
estimate of y for any value of x is the mean of the y-distribution. In
other words, knowledge of x is of no value in predicting y. When $r = 0$
in (10a), $S_y = \sigma_y$. This is to be expected since the dispersion
$S_y$ about the line $y = \bar{y}$ is the same as the dispersion
$\sigma_y$ of the given y's about their mean. But as $|r|$ increases
from 0 to 1, $S_y$ decreases from $\sigma_y$ to 0. Graphically, the
meaning of this improvement in $S_y$ in comparison with $\sigma_y$, as r
increases, is shown in Figure 34 where parallel lines are drawn at a
vertical distance of $S_y$ on either side of the regression line RR'.
For a given value of $|r|$ this strip encloses the average dispersion
about the line. The strip on either side of $y = \bar{y}$ at a distance
of $\sigma_y$ from it encloses the average dispersion about the line when
$r = 0$. As $|r|$ increases from 0, the line rotates from the horizontal
position of $y = \bar{y}$ to the terminal position it would have when
$|r| = 1$, and at the same time $S_y$ decreases toward 0.

Fig. 34. For a Fixed Value of $\sigma_y$, $S_y$ Decreases in Proportion
to $(1 - r^2)^{1/2}$ as r Increases

1 See H. L. Rietz, On Functional Relations for which the Coefficient of Correlation
is Zero. Journal American Statistical Association, vol. 16, 1919, pp. 472-476.

Formula (10a) tells us that as $|r|$ thus increases, $S_y$ decreases
from $\sigma_y$ in proportion to $(1 - r^2)^{1/2}$.

A similar analysis could be made concerning the line of regression (9)
of x on y which rotates from the vertical position $x = \bar{x}$ when
$|r| = 0$ to meet and coincide with line (8) when $|r| = 1$. As line (9)
rotates, $S_x$ decreases from $\sigma_x$ to 0 in proportion to
$(1 - r^2)^{1/2}$ as r increases.

As $|r| \to 1$, (12) and (13) rotate toward each other at equal angular
velocities. When they are coincident their slope is $\pm 1$. Lines (8)
and (9) rotate at angular velocities which are proportional to
$m_1 = \tan\alpha$ and $m_2 = \tan\beta$, respectively, where $m_1$ and
$m_2$ are defined in §4. Their slope at coincidence is
$\pm\sigma_y/\sigma_x$. For line (12) it can
be shown that 

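(14)    $\frac{1}{N}\sum \delta^2 = 1 - r^2$

where $\delta$ is the difference between an observed value of $t_y$ and
the ordinate obtained from (12) for the corresponding value of $t_x$.
Thus, since $\frac{1}{N}\sum t_x^2 = \frac{1}{N}\sum t_y^2 = 1$ and
$\frac{1}{N}\sum t_x t_y = r$ in standard units,

$\frac{1}{N}\sum \delta^2 = \frac{1}{N}\sum (t_y - r t_x)^2$

$= \frac{1}{N}\sum t_y^2 - 2r \cdot \frac{1}{N}\sum t_x t_y + r^2 \cdot \frac{1}{N}\sum t_x^2$

$= 1 - 2r^2 + r^2$

$= 1 - r^2$.

This result would also be apparent from the derivation of (10) since

$\delta = d/\sigma_y$.

It is obvious from (14) that the maximum value of
$\frac{1}{N}\sum \delta^2$ is unity. Therefore, adopting

(15)    $1 - \frac{1}{N}\sum \delta^2$

as a measure of goodness of fit, we see from (14) and (15) that $r^2$
is a measure of the goodness of fit of (12) to the points of the scatter
diagram expressed in standard units. By an analogous argument a
similar conclusion concerning (13) can be made.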

7. Further Discussion. Given a set of N pairs of correlated values of x
and y, suppose the necessary constants are evaluated to obtain the
regression equation (8). Then if the given values of x are substituted
in this equation, a set of estimated y's, say $E_y$, will be obtained.
The mean, $\bar{E}_y$, of these estimated y's is the same as the mean of
the observed y's. The proof is as follows. From (8) we have

$E_y = \bar{y} + r\frac{\sigma_y}{\sigma_x}(x - \bar{x})$.

Then

$\frac{1}{N}\sum_{i=1}^{N} E_{y_i} = \frac{1}{N}\sum_{i=1}^{N} \bar{y} + r\frac{\sigma_y}{\sigma_x} \cdot \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})$.

But $\sum_{i=1}^{N}(x_i - \bar{x}) = 0$ by Theorem VI, Chapter III. So
$\bar{E}_y = \bar{y}$.

i 

We now state the following theorem. 

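Theorem II. The variance, $\sigma_{E_y}^2$, of the estimated y's equals
$r^2\sigma_y^2$.

Proof: By definition,

$\sigma_{E_y}^2 = \frac{1}{N}\sum_{i=1}^{N}(E_{y_i} - \bar{E}_y)^2$.

From the above discussion, $(E_{y_i} - \bar{E}_y)$ is the same as the
deviation $(y_i - \bar{y})$ which is given by (8). So

$\sigma_{E_y}^2 = \frac{1}{N}\sum_{i=1}^{N}\left[r\frac{\sigma_y}{\sigma_x}(x_i - \bar{x})\right]^2 = r^2\frac{\sigma_y^2}{\sigma_x^2} \cdot \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2 = r^2\frac{\sigma_y^2}{\sigma_x^2}\,\sigma_x^2$.

Hence

(16)    $\sigma_{E_y}^2 = r^2\sigma_y^2$.

From this theorem and (10) we obtain

(17)    $S_y^2 = \sigma_y^2 - \sigma_{E_y}^2$.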






This relation helps to clarify the meaning of r and $S_y$. It is
conventional to call $\sigma_{E_y}^2$ the variance in y which can be
explained from knowledge of x; that is, which the regression of y on x
accounts for. Therefore, (17) shows that $S_y^2$ is the variation in y
after the accompanying variation in x is duly discounted. $S_y^2$ is
sometimes called the residual variance because it measures the variation
in the dependent variable y which knowledge of x fails to account for.
This relation can be depicted geometrically by the sides of a right
triangle. To standardize the representation we can take $\sigma_y = 1$
as the diameter of a semi-circle within which is inscribed the right
triangle, as in Figure 35.

In the figure, $\cos\theta = \sigma_{E_y}/\sigma_y$. So from (16) we
have $\cos\theta = r$. The particular values of $\theta$ in the figure,
found from a table of cosines, are $\theta = 36° 52'$ when r = .8, and
$\theta = 25° 50'$ when r = .9.

Theorem III. The correlation between actual and estimated values
of y is the same as that between x and y.

Proof: We are to show that

$\frac{\frac{1}{N}\sum (y - \bar{y})(E_y - \bar{E}_y)}{\sigma_y\,\sigma_{E_y}}$

reduces to one of the formulas for r. Substituting the values for
$E_y$, $\bar{E}_y$, $\sigma_{E_y}$ into the above expression and
simplifying, we obtain (3). The details of the proof are left to the
student as an exercise.
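
The results of this section can be checked numerically. The following is a minimal sketch in Python, assuming a small fictitious data set with positive correlation and divide-by-N moments throughout; the data and all variable names are illustrative assumptions, not part of the text.

```python
import math
from statistics import fmean, pvariance

# Small fictitious data set (an assumption made only for illustration).
x = [8, 6, 4, 7, 5]
y = [9, 8, 5, 6, 2]


def correlation(a, b):
    """Product-moment correlation using divide-by-N moments."""
    mean_a, mean_b = fmean(a), fmean(b)
    cov = fmean([(p - mean_a) * (q - mean_b) for p, q in zip(a, b)])
    return cov / math.sqrt(pvariance(a) * pvariance(b))


r = correlation(x, y)
slope = r * math.sqrt(pvariance(y) / pvariance(x))

# Estimated y's from the regression of y on x, equation (8).
e_y = [fmean(y) + slope * (xi - fmean(x)) for xi in x]

mean_of_estimates = fmean(e_y)      # equals the mean of the observed y's
var_of_estimates = pvariance(e_y)   # Theorem II: r^2 * var(y)
# Residual variance S_y^2, computed directly from the residuals.
residual_var = fmean([(yi - ei) ** 2 for yi, ei in zip(y, e_y)])
# Theorem III: correlation of actual with estimated y's (r > 0 here).
r_actual_vs_estimated = correlation(y, e_y)

print(mean_of_estimates, var_of_estimates, residual_var, r_actual_vs_estimated)
```

The residual variance computed from the residuals agrees with relation (17), which is the point of the right-triangle picture.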

8. Coefficient of Alienation. A measure of the failure to improve
estimates of y from knowledge of correlation is given by

(18)    $k = (1 - r^2)^{1/2}$.

It is sometimes called the coefficient of alienation. Incidentally, it
is interesting to observe that the functional relation between k and r
is shown, graphically, by a semi-circle of unit radius, i.e.,

$f(r) = (1 - r^2)^{1/2}$.







The formula

$k' = 1 - (1 - r^2)^{1/2}$

may be called the improvement factor because it shows the decrease
in $S_y/\sigma_y$ as $|r|$ increases. It is clear that

$k^2 = \frac{\sum d^2}{N\sigma_y^2} = \frac{S_y^2}{\sigma_y^2}$  and  $k' = 1 - k$.

Table 31 gives$^1$ values of k and k' for values of r. With no
knowledge of correlation, the best estimate of an individual y is
$\bar{y}$. Values of k' for assigned r's show how much better than this
guess is the estimate of an individual y value with knowledge of
correlation. For example, when r = .5 the column headed k in Table 31
shows that the standard error $S_y$ is about 87% of $\sigma_y$. Or, from
the k' column, $S_y$ has been reduced only 13% from what it would have
been if $\bar{y}$ had been used for prediction purposes. The third
column thus shows how the prediction value of r varies with r. Thus as
$|r|$ decreases from 1 to .8, $S_y/\sigma_y$ increases from 0 to 60%. Or
from another point of view, as $|r|$ increases from 0 to .8, the error
of estimate is improved by only 40%. A correlation of r = .9 permits
prediction of individual y's only 56% better than a mere guess based
on the mean.

Table 31. Values of r and the Corresponding Values of k and k'

      r         k         k'
     .1       .995      .005
     .2       .980      .020
     .3       .954      .046
     .4       .917      .083
     .5       .866      .134
     .6       .800      .200
     .7       .714      .286
     .8       .600      .400
     .9       .436      .564
     .92      .392      .608
     .94      .341      .659
     .96      .280      .720
     .98      .199      .801
    1.00     0.000     1.000

1 Constructed from a table of sines and cosines. Letting $r = \cos\theta$,
$\sin\theta = (1 - \cos^2\theta)^{1/2} = (1 - r^2)^{1/2}$.
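
The values of k and k' follow directly from formula (18), so they can be recomputed rather than read from a trigonometric table. The sketch below is a hedged illustration in Python; the function names `alienation` and `improvement` are assumptions introduced here, not notation from the text.

```python
import math


def alienation(r):
    """Coefficient of alienation k = (1 - r^2)^(1/2), formula (18)."""
    return math.sqrt(1.0 - r * r)


def improvement(r):
    """Improvement factor k' = 1 - k."""
    return 1.0 - alienation(r)


# Print k and k' for a few correlation values.
for r in (0.1, 0.5, 0.8, 0.9, 0.98, 1.0):
    print(f"r = {r:4.2f}   k = {alienation(r):.3f}   k' = {improvement(r):.3f}")
```

Even at r = .9 the standard error is still well over two-fifths of the standard deviation of y, which is the point the text makes about individual predictions.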






It is fairly obvious that we cannot, with any considerable degree 
of reliability, predict from ordinary values of r an individual y for an 
assigned x. However, with a large N, we can give a very reliable
prediction of the mean of y values that correspond to an assigned 
value of x. This can best be explained from a correlation table 
which is used when N is large and which will be explained in the 
next section. 

Exercises 

1. Given the following correlated data.

        x     8     6     4     7     5
        y     9     8     5     6     2

(a) Compute the correlation coefficient.

(b) Find the regression line of y on x.

(c) Find the estimated values of y corresponding to the given values of x.

(d) Compute the standard error $S_y$ of predictions in two different ways.

Ans. $r = \frac{2.4}{\sqrt{2}\,\sqrt{6}} = .69$,  $m_1 = r\frac{\sqrt{6}}{\sqrt{2}} = 1.2$,  $S_y = \sqrt{3.12} = 1.77$.

Note. In practical work, it is never worth calculating a correlation
coefficient for so few observations. These fictitious data are given
solely as an exercise on which the student can test his knowledge of the
methodology.
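
The answers to Exercise 1 can be reproduced with a few lines of arithmetic. This is a minimal sketch in Python, assuming divide-by-N moments as used throughout the chapter; the variable names are illustrative only.

```python
import math

x = [8, 6, 4, 7, 5]
y = [9, 8, 5, 6, 2]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
var_x = sum((a - mean_x) ** 2 for a in x) / n
var_y = sum((b - mean_y) ** 2 for b in y) / n
cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

# (a) correlation coefficient
r = cov_xy / math.sqrt(var_x * var_y)

# (b) regression line of y on x: y = intercept + slope * x
slope = cov_xy / var_x
intercept = mean_y - slope * mean_x

# (c) estimated y's for the given x's
estimates = [intercept + slope * a for a in x]

# (d) standard error of estimate, two ways
s_y_from_r = math.sqrt(var_y * (1 - r ** 2))
s_y_from_residuals = math.sqrt(
    sum((b - e) ** 2 for b, e in zip(y, estimates)) / n
)

print(round(r, 2), round(slope, 2), round(s_y_from_r, 2))
```

Both routes in part (d) give the same value, the square root of 3.12.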

2. Prove that the ratio of the variance of the estimated y's (taken about
their mean) to the variance $\sigma_y^2$ of the given y's is equal to $r^2$.

3. If $S_y^2/\sigma_y^2 = 1 - r^2$ is the percentage of the total variance
of y uncontrolled by knowledge of x, what is the remaining percentage,
determined by, or calculable from, knowledge of x?

4. What equation is the equivalent mathematical statement for the following words:

If the respective deviations in each series, x and y, from their means
were expressed in units of standard deviations (that is, if each were
divided by the standard deviation of the series to which it belongs) and
plotted to a scale of standard deviations, the slope of a straight line best
describing the plotted points would be the correlation coefficient r.

5. Given the standard deviations $\sigma_x$ and $\sigma_y$ of two
distributions of correlated variates:

(a) What is the standard error in estimating y from x if r = 0?

(b) By how much is $S_y$ in (a) reduced if r is increased to .25?

(c) How large must r be in order that $S_y$ be one-half as large as in (a)?

(d) What must r be in order that $S_y$ be reduced to one-third its value in (a)?

(e) At what value of r is $S_y$ reduced to zero?

(f) For any value of r, what is the ratio between the standard error of
estimating y from x and the standard deviation of the y-distribution?








6. Evaluate the following statements:

(a) A correlation coefficient less than zero indicates an absence of linear
relationship.

(b) A correlation coefficient of r = .6 indicates twice as close relationship
as a coefficient of r = .3.

7. If all the points lie exactly on the regression line of y on x, show
that $S_y^2 = 0$ and hence that $r = \pm 1$.

8. Show that $S_y^2$ may be computed by means of the relation

$N S_y^2 = \sum y'^2 - \frac{(\sum x'y')^2}{\sum x'^2}$

where the primes denote deviations from the means.

9. (For Analytics students.) Show that the tangent of the angle from line (8)
to line (9) is

$\tan\theta = \frac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2}\left[\frac{1 - r^2}{r}\right]$

and from line (12) to line (13) is

$\tan\theta = \frac{1 - r^2}{2r}$.

What is the value of $\theta$ when r = 1, when r = 0?

10. The least-squares criterion of best fit requires that $\sum\delta^2$
be a minimum, where $\delta$ is the distance between the line and a
point. Three cases arise depending on whether

Case I, $\delta$ is measured parallel to the y-axis,

Case II, $\delta$ is measured parallel to the x-axis,

Case III, $\delta$ is measured perpendicular to the line.

We have seen that Case I yields line (12) and that Case II yields line (13).
In Case III the line has no universally accepted name but it may be called
the “geometrically best-fitting line.”

(For calculus students.) For Case III prove the following:

(a) In standard units, the equation of the line is

$t_y = t_x$ if r > 0

and

$t_y = -t_x$ if r < 0.

Solution. Let the equation of the required line be

$t_y = m t_x + k$.

Then by analytics,

$\frac{1}{N}\sum\delta^2 = \frac{1}{N}\sum\left(\frac{m t_x + k - t_y}{\sqrt{1 + m^2}}\right)^2 = \frac{m^2 + k^2 + 1 - 2mr}{1 + m^2}$.

To make this a minimum, first put $k^2 = 0$. Call the result f(m).
Then

$f(m) = \frac{m^2 + 1 - 2mr}{1 + m^2}$,

$f'(m) = \frac{2m^2 r - 2r}{(1 + m^2)^2}$,

$f''(m) = \frac{4mr(3 - m^2)}{(1 + m^2)^3}$.

The first derivative vanishes when $m = \pm 1$, and at these values the
second derivative is positive when m and r have the same sign.
Since f(m) is therefore a minimum when m has the sign of r, we are to
take m = 1 when r > 0 and m = -1 when r < 0.

(b) If r = 0, all lines (for which $k^2 = 0$) fit equally well. Hint. If r = 0,
f(m) = 1.

(c) $\frac{1}{N}\sum\delta^2 = 1 - |r|$. Hint. What is the value of f(m) when $m = \pm 1$?
Note that $|r| = +r$ if r > 0 and $|r| = -r$ if r < 0.

(d) Goodness of fit is measured by $|r|$.

(e) When r = .6 the fit is twice as good as when r = .3.
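
The conclusion of Case III can be checked by brute force. The sketch below, in Python, evaluates $f(m) = (m^2 + 1 - 2mr)/(1 + m^2)$ on a grid of slopes and picks the minimizing slope; the grid range and step and the function name are assumptions chosen only for illustration.

```python
def mean_sq_perp_distance(m, r):
    """f(m): mean squared perpendicular distance, standard units, k = 0."""
    return (m * m + 1 - 2 * m * r) / (1 + m * m)


# Candidate slopes from -5 to 5 in steps of 0.001 (an arbitrary grid).
grid = [i / 1000 for i in range(-5000, 5001)]

results = {}
for r in (0.6, -0.3):
    best_m = min(grid, key=lambda m: mean_sq_perp_distance(m, r))
    results[r] = (best_m, mean_sq_perp_distance(best_m, r))
    print(r, results[r])
```

For each r the minimizing slope has the sign of r and unit magnitude, and the minimum value equals $1 - |r|$, in agreement with parts (a) and (c).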


9. Correlation Table. When the sample to be studied is large,
it is more convenient to replace the scatter diagram by a correlation
table. We may divide the xy-plane into rectangles of convenient
size, and all points of the scatter diagram falling within any rectangle
are thought of as being concentrated at the center of this rectangle.
A number is then written within the rectangle to designate the
number of points at its center. A correlation table is therefore a
two-way frequency table exhibiting the frequencies in each class
interval.


Table 32

                    65-69  70-74  75-79  80-84  85-89  90-94  95-99
   y \ x              67     72     77     82     87     92     97    f(y)

   90-94    92                            1      2      3      1       7
   85-89    87                     1      3      8      1      5      18
   80-84    82        4      4      6      4      9      1             28
   75-79    77        3      3      7      6      4                    23
   70-74    72        2      3      5      6      1      1             18
   65-69    67        3      2                                          5
   60-64    62        1                                                 1

   f(x)              13     12     19     20     24      6      6     100

Suppose Table 32 is constructed in this way for a set of average 
daily grades (x) and final examination grades (y) of 100 students. 
When the data have been thus grouped into classes, the class marks 
are regarded as the variate values. Thus in Table 32 there are 9 
students whose daily grades are 87 and whose final examination grades 
are 82. The last column labeled f(y) represents the distribution 
of y variates and the last row labeled f(x) represents the distribution
of x variates. A correlation table is thus a bivariate distribution. 
In the above table the width of the class interval is the same for x 
and y, but of course this is not generally the case. 

10. Notation. In order to compute r from a correlation table it
will be necessary to develop new notation. Since we are now dealing
with frequencies both in the x-direction and the y-direction, we will
distinguish between them by f(x) and f(y). To be sure, this has
the disadvantage of being the same symbol as that for function, but
from the context no ambiguity should arise.

Generalizing, a correlation table is of the following form:

   y \ x      x_1    x_2    ...     x     ...    x_n      f(y)

   y_n
   y_(n-1)
   ...
   y                             f(x, y)                  $\sum_x f(x, y)$
   ...
   y_1

   f(x)                      $\sum_y f(x, y)$             $N = \sum_y \sum_x f(x, y) = \sum_x \sum_y f(x, y)$


The rectangles containing the frequencies are called cells. The
frequency in a typical cell is denoted by f(x, y), meaning the frequency
in the cell whose coordinates are x and y, where x and y are the
mid-values of the class intervals. Both columns and rows are
subdistributions of the total frequency N. Each column is a frequency
distribution of y's corresponding to a mid-x value. Similarly, each
row is a frequency distribution corresponding to a mid-y value.
The sum along any row is denoted by $\sum_x f(x, y)$, being the sum of
the frequencies in the (x, y) cells in the x-direction. Since the
marginal total for any row is the total frequency corresponding to
a given value of y, it is therefore written in the column headed f(y).
Thus, in Table 32, for y = 92,

$\sum_x f(x, y) = \sum_x f(x, 92) = 1 + 2 + 3 + 1 = 7$.

Similarly, $\sum_y f(x, y)$ denotes a summation in the y-direction of
all the entries in a column, corresponding to a fixed value of x, so it
denotes an entry in the bottom row which contains the f(x) frequencies.
Thus, for x = 67,

$\sum_y f(67, y) = 4 + 3 + 2 + 3 + 1 = 13$.

Summarizing,

(19)    $\sum_x f(x, y) = f(y)$;   $\sum_y f(x, y) = f(x)$.

With regard to N, we may obtain it from a correlation table in
three ways: (1) by adding the entries across the rows and then
totaling the resulting sums in the marginal column labeled f(y);
(2) by adding the entries along the columns and then totaling the
results in the marginal row labeled f(x); (3) by adding the entries
in the cells in any order whatsoever. Hence, the following notation,

(20)    $\sum_y \sum_x f(x, y) = \sum_x \sum_y f(x, y) = \sum_{x,y} f(x, y) = N$,

will denote, respectively, the above-named procedures or orders in
summing. From (19) and (20) we have

(21)    $N = \sum_y f(y) = \sum_x f(x) = \sum_{x,y} f(x, y)$.

We may call f(x) and f(y) the marginal distributions of x and y,
respectively. A correlation table with cell frequencies f(x, y)
uniquely determines the marginal totals f(x) and f(y). The converse,
however, is false. For example, we might replace the four cell
frequencies in the upper right-hand corner of Table 32 by four other
cell frequencies having the same row and column sums without disturbing
the marginal totals.
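
The summation notation of (19) to (21) amounts to taking row sums and column sums of the table. The following Python sketch applies it to the cell frequencies of Table 32 and then illustrates that the marginal totals do not determine the cells. The replacement corner values `[[2, 2], [2, 4]]` are a hypothetical choice made here for illustration, not the author's example.

```python
import copy

# Table 32: rows are y = 92, 87, 82, 77, 72, 67, 62 (top to bottom);
# columns are x = 67, 72, 77, 82, 87, 92, 97 (left to right).
table = [
    [0, 0, 0, 1, 2, 3, 1],
    [0, 0, 1, 3, 8, 1, 5],
    [4, 4, 6, 4, 9, 1, 0],
    [3, 3, 7, 6, 4, 0, 0],
    [2, 3, 5, 6, 1, 1, 0],
    [3, 2, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
]


def marginals(cells):
    """Return (f_y, f_x): row sums and column sums, as in (19)."""
    f_y = [sum(row) for row in cells]
    f_x = [sum(col) for col in zip(*cells)]
    return f_y, f_x


f_y, f_x = marginals(table)
n = sum(f_y)  # equation (21); equals sum(f_x) as well

# Hypothetical replacement of the upper right-hand 2 x 2 corner
# (3 1 / 1 5) by other cells with the same row and column sums.
altered = copy.deepcopy(table)
altered[0][5], altered[0][6] = 2, 2
altered[1][5], altered[1][6] = 2, 4

print(f_y, f_x, n)
print(marginals(altered) == (f_y, f_x))
```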


11. Means and Variances. We will now express the means in
terms of this notation, taking first the mean of x's. From the
fundamental definition, we must multiply each x by its corresponding
frequency in the cells and sum the results, taking the products in any
order whatsoever. Hence,

$\bar{x} = \frac{1}{N}\sum_{x,y} x f(x, y)$.

This may also be written

$\bar{x} = \frac{1}{N}\sum_x \sum_y x f(x, y) = \frac{1}{N}\sum_x x \sum_y f(x, y) = \frac{1}{N}\sum_x x f(x)$.

Observe that the x may be moved to the left of $\sum_y$ in the second
expression because x is treated as a constant in a summation
performed with respect to y.

Similarly, we have,

$\bar{y} = \frac{1}{N}\sum_{x,y} y f(x, y) = \frac{1}{N}\sum_y \sum_x y f(x, y)$

$= \frac{1}{N}\sum_y y \sum_x f(x, y) = \frac{1}{N}\sum_y y f(y)$.

The student will observe that the last expression for the mean in each 
case is identical with that given for a frequency distribution of one 
variable, when allowance is made for the necessity of distinguishing 
between variables. 

Any column is an x array of y's, so the symbol $\bar{y}_x$ is appropriate
for the mean of a column. Similarly, $\bar{x}_y$ denotes the mean of a y
array of x's, i.e., of a row. We may now state the following theorem.

Theorem IV. The mean $\bar{y}$ for the whole table (in the y direction)
is equal to the mean of the values $\bar{y}_x$ for the several columns
when each $\bar{y}_x$ is weighted with the frequency in that column.

Proof: We are required to show that

$\frac{1}{N}\sum_x f(x)\,\bar{y}_x = \bar{y}$

where

$\bar{y}_x = \frac{1}{f(x)}\sum_y y f(x, y)$.

Upon substituting in the first equation the value of $\bar{y}_x$ as given
by the second equation, we have

$\frac{1}{N}\sum_x f(x) \cdot \frac{1}{f(x)}\sum_y y f(x, y) = \frac{1}{N}\sum_{x,y} y f(x, y) = \bar{y}$.

It is suggested that the student state and prove a similar theorem
concerning $\bar{x}$.

In this new notation, the definitions of the variances become

$\sigma_x^2 = \frac{1}{N}\sum_{x,y}(x - \bar{x})^2 f(x, y) = \frac{1}{N}\sum_x x^2 f(x) - \bar{x}^2$;

$\sigma_y^2 = \frac{1}{N}\sum_{x,y}(y - \bar{y})^2 f(x, y) = \frac{1}{N}\sum_y y^2 f(y) - \bar{y}^2$.


Exercises 

1. Evaluate the following expressions in Table 32.

(a) For x = 82,

$\sum_y f(x, y)$,  $\sum_y y f(x, y)$,  $f(x)$,  $\bar{y}_x$.

(b) For y = 87,

$\sum_x f(x, y)$,  $\sum_x x f(x, y)$,  $f(y)$,  $\bar{x}_y$.

2. Refer to Table 27 (Chapter V) and let x be the number of a column. Express
the answers in the third and second lines from the bottom of the table in
terms of the notation of this section. Thus for x = 1,

$\bar{y}_x = \frac{1}{f(x)}\sum_y y f(x, y) = \frac{1}{7}[85 + (75)2 + (65)2 + (55)2] = 67.86$.

12. Computation of Means. Just as in the case of a one-way
frequency distribution it was found convenient to choose an arbitrary
origin and take the class interval as the unit, so we now do
likewise. Let

(22)    $u = \frac{1}{h}(x - x_0)$;  i.e.,  $x = uh + x_0$.

Hence,

(23)    $\bar{x} = \bar{u}h + x_0$

where $\bar{u} = \frac{1}{N}\sum_u u f(u)$.

Likewise, let

(24)    $v = \frac{1}{k}(y - y_0)$;  i.e.,  $y = vk + y_0$,

whence

(25)    $\bar{y} = \bar{v}k + y_0$,

where $\bar{v} = \frac{1}{N}\sum_v v f(v)$.

Then a suitable form for computing the means of the x's and y's
is as follows:

$\bar{u} = \frac{1}{N}\sum_u u f(u) = -.28$,

whence $\bar{x} = 82 + 5(-.28) = 80.6$;

$\bar{v} = \frac{1}{N}\sum_v v f(v) = .54$,

whence $\bar{y} = 77 + 5(.54) = 79.7$.

In the table f(v) = f(y) and f(u) = f(x) because u and v are merely different ways
of describing the cells but in no way change the frequencies in those cells.
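
The coded-unit computation of (22) to (25) can be sketched as follows. This is a minimal Python illustration using the marginal frequencies of Table 32, with the arbitrary origins $x_0 = 82$, $y_0 = 77$ and class width 5 quoted in the text; the helper name `coded_mean` is an assumption introduced here.

```python
def coded_mean(codes, freqs, origin, width):
    """Mean via an arbitrary origin: origin + width * mean of the codes."""
    n = sum(freqs)
    mean_code = sum(c * f for c, f in zip(codes, freqs)) / n
    return mean_code, origin + width * mean_code


# x class marks 67, 72, ..., 97 coded as u = -3, ..., 3 about x0 = 82.
u_codes = [-3, -2, -1, 0, 1, 2, 3]
f_u = [13, 12, 19, 20, 24, 6, 6]

# y class marks 62, 67, ..., 92 coded as v = -3, ..., 3 about y0 = 77.
v_codes = [-3, -2, -1, 0, 1, 2, 3]
f_v = [1, 5, 18, 23, 28, 18, 7]

mean_u, mean_x = coded_mean(u_codes, f_u, origin=82, width=5)
mean_v, mean_y = coded_mean(v_codes, f_v, origin=77, width=5)

print(mean_u, mean_x)
print(mean_v, mean_y)
```

Coding changes only the labels of the classes, so the frequencies used are the same marginal totals f(x) and f(y).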






13. Computation of r. In the expressions of §10 and §11 the
(u, v) coordinates could have been used instead of (x, y). The use of
the former simplifies the computation of r. A preliminary discussion
of certain expressions will help in understanding the formula for r.
Let us consider first the following expression:

(a)    $\sum_{u,v} u v f(u, v)$.

This means: multiply the f in each cell by the u and v coordinates of
that cell and add the results, proceeding from cell to cell over the
whole table in any order whatsoever. But it may be more convenient
to proceed in a definite order, say down the columns. Then
(a) becomes

(b)    $\sum_u \sum_v u v f(u, v) = \sum_u u \sum_v v f(u, v)$.

The expression $\sum_v v f(u, v)$ in the right member of (b) means: for
any u (i.e., for any column), multiply each f by its own v and add
the results. Let us denote this sum by V. Then the right member
of (b) means: multiply the V for each column by the u of that
column and add the results, proceeding from column to column
(i.e., summing in the u-direction). We may also obtain the same
result as in (a) by proceeding along the rows. Thus (a) may be
written

(c)    $\sum_v \sum_u u v f(u, v) = \sum_v v \sum_u u f(u, v)$.

The expression $\sum_u u f(u, v)$ means: for any v (i.e., for any row),
multiply each f in the row by its own u and add the results. If we
call this sum U, then the right member of (c) means: multiply
the U for each row by the v for that row and add the results,
proceeding from row to row (i.e., summing in the v-direction).

We are now ready to derive the formula for r.

Since we are now dealing with a frequency distribution, the
fundamental definition of r becomes

(26)    $r = \frac{\frac{1}{N}\sum_{x,y}(x - \bar{x})(y - \bar{y}) f(x, y)}{\sigma_x \sigma_y}$.

From (22) and (23), we have

$(x - \bar{x}) = h(u - \bar{u})$,

and from (24) and (25),

$(y - \bar{y}) = k(v - \bar{v})$.

Since (x, y) and (u, v) are merely different notations for the same
cell, we have

$f(x, y) = f(u, v)$.

For computing purposes, the standard deviations are defined as
follows:

(27)    $\sigma_x = h\sigma_u$,   $\sigma_y = k\sigma_v$,

where

(28)    $\sigma_u = \left[\frac{1}{N}\sum_u u^2 f(u) - \bar{u}^2\right]^{1/2}$,   $\sigma_v = \left[\frac{1}{N}\sum_v v^2 f(v) - \bar{v}^2\right]^{1/2}$.

Therefore, (26) becomes

$r = \frac{\frac{1}{N}\sum_{u,v}(u - \bar{u})(v - \bar{v}) f(u, v)}{\sigma_u \sigma_v}$.

If now we let

$U = \sum_u u f(u, v)$  and  $V = \sum_v v f(u, v)$,

then since

$\sum_{u,v} u v f(u, v) = \sum_v v \sum_u u f(u, v) = \sum_u u \sum_v v f(u, v)$,

the above expression for r may be written in either of the following
ways:

(29)    $r = \frac{\frac{1}{N}\sum_v vU - \bar{u}\bar{v}}{\sigma_u \sigma_v} = \frac{\frac{1}{N}\sum_u uV - \bar{u}\bar{v}}{\sigma_u \sigma_v}$.

The fact that

$\sum_v vU = \sum_u uV$

serves as a check in the table.

The above procedure is illustrated in Table 35.


Table 35. Computation of r for Data of Table 32

            u     -3    -2    -1     0     1     2     3
    v    y \ x    67    72    77    82    87    92    97    f(v)   vf(v)   v²f(v)     U     vU

    3     92                         1     2     3     1      7     21      63       11     33
    2     87                   1     3     8     1     5     18     36      72       24     48
    1     82       4     4     6     4     9     1           28     28      28      -15    -15
    0     77       3     3     7     6     4                 23      0       0      -18      0
   -1     72       2     3     5     6     1     1           18    -18      18      -14     14
   -2     67       3     2                                    5    -10      20      -13     26
   -3     62       1                                          1     -3       9       -3      9

   f(u)           13    12    19    20    24     6     6    100     54     210      -28    115
   uf(u)         -39   -24   -19     0    24    12    18    -28
   u²f(u)        117    48    19     0    24    24    54    286
   V              -7    -3     3     7    30    11    13     54
   uV             21     6    -3     0    30    22    39    115


Explanation: The table is self-explanatory except possibly the U and V entries.
Recalling that $U = \sum_u u f(u, v)$, the first entry in the U column is
obtained from the sum of the following products:
0·1 + 1·2 + 2·3 + 3·1 = 11; the second entry from
(-1)·1 + 0·3 + 1·8 + 2·1 + 3·5 = 24. Since $V = \sum_v v f(u, v)$, the
first entry in the V row is obtained from
1·4 + 0·3 + (-1)·2 + (-2)·3 + (-3)·1 = -7.
Similarly, for the other entries.








































































































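Computations:

$\sigma_u^2 = \frac{1}{N}\sum_u u^2 f(u) - \bar{u}^2 = 2.86 - (-.28)^2 = 2.7816$.

$\sigma_u = \sqrt{2.7816} = 1.67$.

$\sigma_v^2 = \frac{1}{N}\sum_v v^2 f(v) - \bar{v}^2 = 2.10 - (.54)^2 = 1.8084$.

$\sigma_v = \sqrt{1.8084} = 1.34$.

Therefore from (29) we have

$r = \frac{1.15 - (-.28)(.54)}{(1.67)(1.34)} = 0.58$.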
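
The whole of the Table 35 computation can be reproduced mechanically. The following is a minimal Python sketch of formula (29), assuming the cell frequencies of Table 32 in coded (u, v) units; the variable names are illustrative, and the U and V sums are formed exactly as described in the explanation of the table.

```python
import math

u_values = [-3, -2, -1, 0, 1, 2, 3]   # coded x, left to right
v_values = [3, 2, 1, 0, -1, -2, -3]   # coded y, top row first
table = [
    [0, 0, 0, 1, 2, 3, 1],
    [0, 0, 1, 3, 8, 1, 5],
    [4, 4, 6, 4, 9, 1, 0],
    [3, 3, 7, 6, 4, 0, 0],
    [2, 3, 5, 6, 1, 1, 0],
    [3, 2, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
]

n = sum(sum(row) for row in table)
f_v = [sum(row) for row in table]          # marginal totals of rows
f_u = [sum(col) for col in zip(*table)]    # marginal totals of columns

mean_u = sum(u * f for u, f in zip(u_values, f_u)) / n
mean_v = sum(v * f for v, f in zip(v_values, f_v)) / n
sigma_u = math.sqrt(sum(u * u * f for u, f in zip(u_values, f_u)) / n - mean_u ** 2)
sigma_v = math.sqrt(sum(v * v * f for v, f in zip(v_values, f_v)) / n - mean_v ** 2)

# U for each row and V for each column.
U = [sum(u * f for u, f in zip(u_values, row)) for row in table]
V = [sum(v * f for v, f in zip(v_values, col)) for col in zip(*table)]

sum_vU = sum(v * row_sum for v, row_sum in zip(v_values, U))
sum_uV = sum(u * col_sum for u, col_sum in zip(u_values, V))  # check: equals sum_vU

r = (sum_vU / n - mean_u * mean_v) / (sigma_u * sigma_v)
print(U, V, sum_vU, sum_uV, round(r, 2))
```

Because r is unchanged by the choice of origin and unit, no conversion back to x and y is needed.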


14. Sign of r. Grouping Errors. It should be observed that the
sign of r depends on the choice of the positive direction along each
coordinate axis. In Table 35 the origin of reference is chosen so
that the data occur in the first quadrant and the directions on the
(x, y)-axes are the conventional ones. These directions were
preserved in changing to (u, v) coordinates. If we had reversed the
direction of the v-axis by labeling the y values larger than y = 77
by v = -1, -2, -3, and those less than y = 77 by v = 1, 2, 3, the
sign of r would be changed. But if the directions of both u and v
were reversed, the sign of r would be unchanged.

When N is small, say less than 100, and the data are grouped into
cells, grouping errors are introduced. In general, the fewer cells
used, the greater the errors. These may be corrected, in part, by
applying Sheppard's corrections to $\sigma_u$ and $\sigma_v$. However,
this will not be insisted upon in this course.


Exercises 

1. By equation (29), show that r is independent of the choice of origin and of 

the units of measurement. 

2. In Table 35, evaluate the following sums: 

Yf(u, 2), E/(2, »), 2>/(“> 1), X>/(-2, v), IZ®/(n, v), v) 

u V U » u v u * v 

— E/(“. V) if V = 0. 

f(v) u 

3. Derive (29). 

4. For the following data, find r and $\bar{x}$, $\bar{y}$, $\sigma_x$,
$\sigma_y$. Note that $x_0$, $y_0$, h, and k do not need to be determined
to compute r, but are required for the means and standard deviations of
x and y.







Heights and Weights of 200 Freshmen

(Heights to Nearest 1/10 Inch; Weights to Nearest 1/2 Pound)

Weight classes x (lbs.) run across; height classes y (in.) run down.
The first and last classes are 90-99.5 and 200-209.5 for x, and
58-59.9 and 76-77.9 for y.

  y \ x     90-  100-  110-  120-  130-  140-  150-  160-  170-  180-  190-  200-   f(y)

  76-                         1                                                        1
  74-                                           1     1     1     1                    4
  72-                         1     1     1     4           1                          8
  70-                   1     2     6     7     6     2     1     2     1     1       29
  68-                   2     8    17     8     9     2     1     1     1             49
  66-                   8    16    14    13     6     2     1                 1       61
  64-             3     8     7     7     3     3     1     1                         33
  62-       1     4     1     7     1                                                 14
  60-                                                                                  0
  58-             1                                                                    1

  f(x)      1     8    20    42    46    32    29     8     6     4     2     2      200


Ans. $\bar{x} = 138.45$ lbs.; $\bar{y} = 67.82$ in.
$\sigma_x = 19.6$ lbs.; $\sigma_y = 2.8$ in.
r = 0.48.

15. Regression Lines for a Correlation Table. The data of a
correlation table may be thought of as dots lying many deep at the
centers of the several cells. There are, of course, f(x, y) of these in
any cell whose coordinates are (x, y), and f(x) is the total number of
dots in a vertical column whose coordinate is x. Suppose now we
replace all the data in each column by an equal number of data
concentrated at the mean of that column. If we denote the ordinate of
this mean point by $\bar{y}_x$, we have

(30)    $\bar{y}_x = \frac{1}{f(x)}\sum_y y f(x, y)$.

Hence, $\bar{y}_x f(x)$ represents the totality of all the values in a column.








For each of the columns there will be a value of (30). Taking the
hypothesis that the mean points of the several columns lie
approximately on a straight line $\bar{y}_x = m_1 x + k$, we may find
$m_1$ and k under a least-squares criterion of approximation. If, in
applying the criterion, the square of the difference between the
observed mean, ${}_o\bar{y}_x$, and the computed mean, ${}_c\bar{y}_x$,
for each array, viz., $(\bar{y}_x - m_1 x - k)^2$, is weighted with the
number f(x) in the array, it turns out that we get the same values for
$m_1$ and k which we obtained when we fitted the regression line of y
on x to all its points.

In proving this, the student of calculus$^1$ would have an easy task
in obtaining the normal equations:

(31)    $\sum_x (\bar{y}_x - m_1 x - k) f(x) = 0$
        $\sum_x (\bar{y}_x - m_1 x - k)\, x f(x) = 0$

whose simultaneous solution yields the desired values of $m_1$ and k.
Expanding (31), we have

(32)    $\sum_x \bar{y}_x f(x) - m_1\sum_x x f(x) - k\sum_x f(x) = 0$
        $\sum_x \bar{y}_x x f(x) - m_1\sum_x x^2 f(x) - k\sum_x x f(x) = 0$.

Since

$\sum_x \bar{y}_x f(x) = \sum_x \sum_y y f(x, y) = N\bar{y}$,

and

$\sum_x \bar{y}_x x f(x) = \sum_x x \sum_y y f(x, y) = \sum_{x,y} x y f(x, y)$,

equations (32) become

(33)    $N\bar{y} - m_1 N\bar{x} - Nk = 0$
        $\sum_{x,y} x y f(x, y) - m_1\sum_x x^2 f(x) - kN\bar{x} = 0$.

1 Differentiating partially $\sum_x f(x)(\bar{y}_x - m_1 x - k)^2$ with
respect to $m_1$ and k respectively, and setting the results equal to
zero, yields equations (31). Instead of differentiating this expression
one may expand it, regard the result as a quadratic both in $m_1$ and k,
and use the Theorem of §3, Chapter VII, to obtain (31).








Solving (33) for $m_1$ and k we find

$k = \bar{y} - m_1\bar{x}$

$m_1 = \frac{\sum_{x,y} x y f(x, y) - N\bar{x}\bar{y}}{\sum_x x^2 f(x) - N\bar{x}^2} = r\frac{\sigma_y}{\sigma_x}$

and the equation of our line becomes

$\bar{y}_x = r\frac{\sigma_y}{\sigma_x}x + \bar{y} - r\frac{\sigma_y}{\sigma_x}\bar{x}$,

that is

(8a)    $\bar{y}_x - \bar{y} = r\frac{\sigma_y}{\sigma_x}(x - \bar{x})$.

Therefore, the best-fitting line for the means of the columns
properly weighted, and the best-fitting line for all the dots are one
and the same straight line. But from the point of view of a correlation
table, a regression line is to be regarded as the equation from which
may be estimated the average of all the y's associated with a particular
value of x. In other words, a prediction in the latter case professes
to give only the mean result (Figure 36).
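
The identity of the two best-fitting lines can be verified numerically. The sketch below, in Python, fits a weighted least-squares line to the column means of Table 32 and an ordinary least-squares line to all the cell frequencies, and compares the slopes and intercepts. The use of Table 32 as the example and all variable names are assumptions made for illustration.

```python
x_marks = [67, 72, 77, 82, 87, 92, 97]
y_marks = [92, 87, 82, 77, 72, 67, 62]   # top row first
table = [
    [0, 0, 0, 1, 2, 3, 1],
    [0, 0, 1, 3, 8, 1, 5],
    [4, 4, 6, 4, 9, 1, 0],
    [3, 3, 7, 6, 4, 0, 0],
    [2, 3, 5, 6, 1, 1, 0],
    [3, 2, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
]

columns = list(zip(*table))
f_x = [sum(col) for col in columns]
n = sum(f_x)

# Column means, equation (30).
col_means = [
    sum(y * f for y, f in zip(y_marks, col)) / fx
    for col, fx in zip(columns, f_x)
]

mean_x = sum(x * f for x, f in zip(x_marks, f_x)) / n
mean_y = sum(m * f for m, f in zip(col_means, f_x)) / n   # Theorem IV
sxx = sum(f * (x - mean_x) ** 2 for x, f in zip(x_marks, f_x))

# Line fitted to the column means, each weighted by f(x).
slope_means = sum(
    f * (x - mean_x) * (m - mean_y)
    for x, m, f in zip(x_marks, col_means, f_x)
) / sxx
intercept_means = mean_y - slope_means * mean_x

# Line fitted to all the dots (every cell counted f(x, y) times).
sxy_all = sum(
    f * (x - mean_x) * (y - mean_y)
    for row, y in zip(table, y_marks)
    for f, x in zip(row, x_marks)
)
slope_all = sxy_all / sxx
intercept_all = mean_y - slope_all * mean_x

print(slope_means, intercept_means)
print(slope_all, intercept_all)
```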



Fig. 36. The Line of Regression of y on x is the Best Fitting Line for
the Means of the Columns

16. Applications. The data of a correlation table are usually
regarded as a sample of the much larger class of similar data
constituting the universe. A regression equation calculated from a
limited but representative sample may give valuable estimates of the
average values of y in the universe associated with designated values
of x.

Let us consider the data of Table 36. Suppose a personnel manager
in charge of hiring employees of a manufacturing plant has
instituted a system of mental tests for applicants, and has gathered
these data showing the relationship between the standing made by
applicants on their mental tests and their productive ability when
measured according to a certain standard of production after they are
hired.


Table 36

              x      22.5   27.5   32.5   37.5   42.5   47.5   52.5   57.5
    y     v \ u       -4     -3     -2     -1      0      1      2      3     f(y)    $\bar{x}_y$

   125      4                                      2      3      2               7      47.5
   115      3                        1      3      1      4      4      4       17      48.1
   105      2                        5      7      8     11      8      7       46      45.9
    95      1                 2      1     10     12      9      8      2       44      44.0
    85      0          1      3     12     11      7     12      7      1       54      40.7
    75     -1          2      1      5      6     16      8      5              43      41.6
    65     -2          2      5      5      8      8      6      1              35      38.0
    55     -3          2      3      3      4      1      1                     14      33.2

   f(x) = f(u)         7     14     32     49     55     54     35     14      260
   $\bar{y}_x$       67.9   72.1   81.9   84.8   85.7   90.9   95.6  105.0


Here x represents the grade made on mental test, and y the per cent of standard
in production. (See also Table 27.) The means of columns are denoted by
$\bar{y}_x$, and the means of rows by $\bar{x}_y$.

In order to demonstrate to the company's management the
connection between his mental tests and the productivity of the
employees he has hired, the personnel manager does the following:

(1) Computes the coefficient of correlation between the two series;

(2) Shows what the estimated productivity of employees would be
whose grades in the mental test fell on the mid-points of the class
intervals of the mental test data.

The means of the columns and of the rows are given in the table.
In addition, he obtains the following results:

$\bar{x} = 42.17$,   $\sigma_y = 17.41$,   $r = .417$,

$\bar{y} = 87.31$,   $\sigma_x = 8.40$,   $m_1 = r\frac{\sigma_y}{\sigma_x} = .864$.

Therefore, the line of regression of y on x is

$\bar{y}_x - 87.31 = .864(x - 42.17)$

or

(34)    $\bar{y}_x = .864x + 50.88$.

This is the equation of the line which best fits the points which
designate the means of the columns (Figure 37). Hence, for an assigned
value of x, equation (34) gives the value of y which is the expected
mean of the column defined by the assigned value of x. The personnel
manager is thus prepared to predict the productivity of applicants
on the basis of their mental test grades. In other words, the
regression equation calculated from the records of those already hired
may be used in selecting from future applicants those most likely to
succeed.$^1$

Fig. 37. Means of Columns and Line of Regression
of y on x for Table 36

1 The critical reader may doubt if the value r = .417 is sufficiently large to 
warrant much confidence in (34) as a predicting equation. The question of 
reliability of predictions is discussed later. 






Exercises 

1. Verify the value of r given for Table 36. 

2. Verify the means of the columns given in Table 36. 

3. Using equation (34) show what the estimated productivity of employees
in the factory referred to above would be whose mental test grades were
22.5, 27.5, etc.

4. For Table 35,

(a) Find the equations of the regression lines.

(b) Locate the axes through the mean of the table and graph the regression
lines.

(c) Compute $S_y$.

5. As in Exercise 4 for the table of heights and weights on page 187.

Ans. to (a),

$\bar{y}_x = .069x + 58.3$
$\bar{x}_y = 3.36y - 89.4$.

17. $S_y$ for a Correlation Table. For ungrouped data we have
defined $S_y$ as a measure of the clustering of the data around the
regression line, and have observed that it is called the standard error
of estimate. In order to understand what $S_y$ has to do with
“estimates” it is necessary first to consider its meaning in a correlation
table. Let us denote by $s_{y.x}$ the standard error about the regression
line in the array of y’s at x. Thus we have

(35)  $s_{y.x}^2 = \dfrac{1}{f(x)} \sum_{y} ({}_{o}y - {}_{c}y_x)^2 f(x, y)$

where ${}_{o}y$ denotes an observed y value and ${}_{c}y_x$ denotes the value
obtained from the regression line for that column. Thus, for the
column headed 32.5 in Table 36 we obtain the computed value
${}_{c}y_x$ by substituting x = 32.5 in (34) whence we find ${}_{c}y_x = 78.96$.
To evaluate $s_{y.x}^2$ for this column we find the square of the deviation
of each of the 32 values of ${}_{o}y$ from 78.96, add the results and divide
by 32. Extracting the square root of the result we find $s_{y.x} = 15.96$.
Moving along the regression line suppose we have computed an $s_{y.x}^2$
for each array of y’s and averaged the results. It is interesting to
learn that this average is $S_y^2$. This is stated more precisely in the
following theorem.

Theorem V. The arithmetic mean of the values of $s_{y.x}^2$ for the several
columns when each $s_{y.x}^2$ is weighted with the frequency in that column is
$S_y^2 = \sigma_y^2(1 - r^2)$.

Proof: Using (35) we have

$\dfrac{1}{N}\sum_{x} f(x)\, s_{y.x}^2 = \dfrac{1}{N}\sum_{x}\sum_{y} ({}_{o}y - {}_{c}y_x)^2 f(x, y).$

Substituting the value given by (30), §15, in the right member of the
above identity we have

$\dfrac{1}{N}\sum_{x,y}\left\{ y - \left[\bar{y} + r\dfrac{\sigma_y}{\sigma_x}(x - \bar{x})\right]\right\}^2 f(x, y),$

that is

$\dfrac{1}{N}\sum_{x,y}\left\{ (y - \bar{y}) - r\dfrac{\sigma_y}{\sigma_x}(x - \bar{x})\right\}^2 f(x, y)$

which reduces to $\sigma_y^2(1 - r^2)$. It is left as an exercise for the student
to show this.
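Theorem V can be checked numerically on any small correlation table. The sketch below uses a hypothetical table of eight observations, chosen only for illustration, and compares the frequency-weighted mean of the column values $s_{y.x}^2$ with $\sigma_y^2(1 - r^2)$; all names are illustrative.

```python
from math import sqrt

# Hypothetical small correlation table: {(x, y): frequency}.
table = {(0, 0): 1, (1, 0): 1, (1, 1): 1, (2, 1): 1, (3, 1): 2, (3, 2): 2}

N = sum(table.values())
x_bar = sum(x * f for (x, y), f in table.items()) / N
y_bar = sum(y * f for (x, y), f in table.items()) / N
var_x = sum((x - x_bar) ** 2 * f for (x, y), f in table.items()) / N
var_y = sum((y - y_bar) ** 2 * f for (x, y), f in table.items()) / N
cov = sum((x - x_bar) * (y - y_bar) * f for (x, y), f in table.items()) / N
r = cov / sqrt(var_x * var_y)


def line(x):
    """Value of y on the regression line of y on x."""
    return y_bar + r * sqrt(var_y / var_x) * (x - x_bar)


# Column frequencies f(x).
f_x = {}
for (x, y), f in table.items():
    f_x[x] = f_x.get(x, 0) + f

# Mean square deviation from the regression line within each column, as in (35).
s2 = {
    x0: sum((y - line(x0)) ** 2 * f for (x, y), f in table.items() if x == x0) / f_x[x0]
    for x0 in f_x
}

weighted_mean = sum(f_x[x0] * s2[x0] for x0 in f_x) / N
theorem_value = var_y * (1 - r ** 2)
print(weighted_mean, theorem_value)
```

The two printed numbers agree up to floating-point error, as the theorem asserts.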

For Table 36 we find $S_y = 15.83$. In Figure 38 the parallel lines
on either side of the regression line RR' are drawn at a vertical
distance of $\pm S_y$ from it. They describe the average limits of
scatter above and below the regression line.

To connect $S_y$ with the reliability of predictions it is necessary
to introduce the concept of a correlation surface. Indeed,
a knowledge of the fundamental properties of a correlation surface
is desirable for a wider outlook on correlation theory in general.

18. Normal Correlation Surface. A correlation table may be
idealized into a surface in somewhat the same way that a histogram
is idealized into a frequency curve. The concept of a surface relates
to the universe from which the observed data of the table may be
regarded as a sample. Let the dimensions of the cells of a table be
Δx and Δy, and suppose columns are erected upon these cells with
altitudes proportional to the frequencies in the cells. The result is
a sort of solid histogram. Then as Δx → 0, Δy → 0, N → ∞, the
tops of the columns approach as a limit a smooth surface which is
called a correlation surface. Our discussion will be confined to the
case where we may assume that this limit is a normal correlation
surface. In discussing this surface it is convenient to let x and y
represent deviations from the respective means and to let z = f(x, y)
denote the frequency function representing the surface. Such a
surface is shown in Figure 39.







Any section of this surface parallel to the yz-plane is a normal
curve and represents the distribution in a column at x. Similarly
any section parallel to the xz-plane representing a row is a normal
curve. The frequency in a cell is measured by that portion of the
volume under the surface which lies over that cell. All those cells
in which the frequency is a fixed value lie on an ellipse. That is, if
contour lines are drawn on the surface joining the points of equal
height above the base they will be ellipses. In other words, sections
of the surface parallel to the xy-plane are ellipses.


Fig. 39. Frequency Surface for Correlated Variables

We will digress here for a brief discussion of an ellipse. We may
think of an ellipse as a transitional figure between a circle and a 
straight line, as the circle flattens out. That is to say, the limiting 
form of an ellipse is a circle at one extreme of the flattening process 
and a straight line segment at the other extreme. The degree of 
flatness is called the eccentricity of the ellipse, and it is proved in 
analytic geometry that the eccentricity varies from zero in the case 
of a circle to unity when the ellipse degenerates into a line. All 
ellipses having the same eccentricity whatever their size have the 
same relative proportions and are therefore similar in form. 

The eccentricity of the elliptical contours of different normal correlation
surfaces varies with the amount of correlation existing in
the corresponding universe. A surface with narrow elliptical contours
represents a universe in which there is high correlation, whereas
if the variables are completely independent in the probability sense
the contour lines are circles when the variables are expressed
in standard units. If the variables are not expressed in standard
units (and r = 0) then the contour lines may be ellipses
but their major and minor axes will coincide with the x- and
y-axes. When r ≠ 0 the axes of the ellipses make an angle
with the xy-axes; their major axis cuts quadrants I and III in the
xy-plane if r > 0 (as in Figure 39) and quadrants II and IV if r < 0.

19. Properties of Normal Bivariate Surface. The equation of a
normal correlation surface is given by

(36)  $f(x, y) = K e^{-p}$

where

$p = \dfrac{1}{2(1 - r^2)}\left[\dfrac{x^2}{\sigma_x^2} - \dfrac{2rxy}{\sigma_x\sigma_y} + \dfrac{y^2}{\sigma_y^2}\right],$

$K = N \div \left(2\pi\sigma_x\sigma_y\sqrt{1 - r^2}\right)$, and x and y represent the correlated
variables referred to their respective means as origin.
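Equation (36) translates directly into a function, and a rough numerical check is that the volume under the surface should be close to N. The sketch below assumes the Table 36 statistics quoted in the text and takes N = 260, the sum of the column frequencies listed for that table; the grid size and function name are illustrative choices.

```python
from math import exp, pi, sqrt


def normal_surface(x, y, N, sigma_x, sigma_y, r):
    """Equation (36); x and y are deviations from their respective means."""
    p = (x ** 2 / sigma_x ** 2
         - 2 * r * x * y / (sigma_x * sigma_y)
         + y ** 2 / sigma_y ** 2) / (2 * (1 - r ** 2))
    K = N / (2 * pi * sigma_x * sigma_y * sqrt(1 - r ** 2))
    return K * exp(-p)


# Statistics of Table 36 as quoted in the text; N = 260 is an assumption
# taken from the column frequencies f(x) listed for that table.
N, sigma_x, sigma_y, r = 260, 8.40, 17.41, 0.417

# Riemann sum over a grid reaching six standard deviations in each direction.
half_steps = 60
dx, dy = sigma_x / 10, sigma_y / 10
volume = sum(
    normal_surface(i * dx, j * dy, N, sigma_x, sigma_y, r) * dx * dy
    for i in range(-half_steps, half_steps + 1)
    for j in range(-half_steps, half_steps + 1)
)
print(round(volume, 3))
```

The same function, evaluated at the cell centers and multiplied by the cell area, would give approximate graduated cell frequencies.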

By means of (36) an observed distribution may be fitted with the
appropriate normal surface assuming that the sample might reasonably
have come from such a universe. This is accomplished by
replacing $\sigma_x$, $\sigma_y$, r, and N in (36) by the corresponding statistics
calculated from the sample and taking the origin at the mean of the
table. Let us assume that an observed distribution has been graduated
by such a surface and the theoretical cell frequencies obtained.
The surface extends to infinity in the xy-plane but contour ellipses
can be obtained which will enclose any desired percentage of the
given frequency when these ellipses are projected orthogonally on to
the xy-plane. They are all concentric, similar, and similarly placed.
Figure 41 represents such an ellipse, say the smallest one necessary
to enclose all the given cells. The systems of perpendicular chords
represent the columns and rows of the table.

The graduated frequencies for each column are normal distributions
whose means lie on the regression line of y on x and whose
standard deviations are in each case given by $S_y = \sigma_y(1 - r^2)^{1/2}$.
To state the same thing in a slightly different way, an array of y’s
corresponding to a fixed value $x_1$ of x is a normal distribution whose
mean deviates from $\bar{y}$ by $r(\sigma_y/\sigma_x)x_1$ and whose standard deviation is
$S_y = \sigma_y(1 - r^2)^{1/2}$ which is independent of $x_1$ and therefore is the
same for all such arrays. Similarly an array of x’s corresponding
to a particular value $y_1$ of y is a normal distribution with a mean which
deviates from $\bar{x}$ by $r(\sigma_x/\sigma_y)y_1$, and a standard deviation of
$S_x = \sigma_x(1 - r^2)^{1/2}$ which is independent of $y_1$ and therefore is the
same for all such arrays. A careful study of Figure 39 will help in
understanding what is meant by these statements.

When the means $\bar{y}_x$ of the columns fall exactly on the regression
line, $s_{y.x}$ becomes the standard deviation of a column and is therefore
the same as $S_y$. Theorem V states that $S_y^2$ is an average of the
values of $s_{y.x}^2$ but when all the quantities being averaged have the
same value, as they do in the ideal case of the normal surface, their
(mean) average is that value. When the standard deviations of
the columns are equal, the regression system of y on x is called a
homoscedastic system. In a universe where they are not equal the
system is said to be heteroscedastic. For a homoscedastic system
with linear regression, $S_y = \sigma_y(1 - r^2)^{1/2}$ is the standard deviation
of each array of y’s.

20. Reliability of Predictions. In using a regression equation to
make predictions we are naturally interested in the degree of confidence
to be expected in the predictions thus made. The use of $S_y$
in this connection is based upon the properties of the normal correlation
surface.

Let us imagine the universe of which Table 36 is a sample and
assume that it may be described by a normal surface. Confining
our attention to a section parallel to the yz-plane in Figure 39 we
know that an x array of y’s is distributed normally about a value of
y determined by a designated value of x in the regression equation
of y on x. That is, the mean of this normal distribution is the
predicted value of y and its standard deviation is $S_y$. The percentage
distribution of such an array is the same as that given in
Figure 23 of Chapter VI, if $S_y$ be taken as the unit of measurement
along the horizontal axis. But an estimate of $S_y$ is its value
calculated from the sample. Moreover, for an observed distribution,
we have seen that $S_y$ is the average standard deviation of the several
columns and therefore it may reasonably be taken as an approximation
to the theoretical $S_y$ which in the universe is the same for all
the columns. We also take the calculated regression equation as an
approximation to the theoretical.

By measuring deviations from the predicted value in terms of
$S_y$ in the same way that σ is used as a unit in measuring deviations
from the mean, we may then enter a normal probability
scale for the probability of a deviation involving multiples of
$S_y$. According to this scale the probability $P_y$ is about .68 for
a deviation of $\pm S_y$ from the predicted value, and the chances are
even for a deviation of $.6745\,S_y$ on either side of the predicted value.

Fig. 42. Representing an x Array of y’s and Deviations of $\pm S_y$ from a
Predicted Value of y


For Table 36 we have found $S_y = 15.83$ and for an applicant
making x = 32.5 on the mental test we have predicted y = 78.96.
Therefore the chances are about 68 in 100 that his percentage of
productivity will be between 78.96 - 15.83 and 78.96 + 15.83, that
is, between 63.13 and 94.79. In other words, the probability is
about .68 that the predicted value will not be in error by more than
15.83.
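The figures .68 and .6745 come from the unit normal distribution and can be recovered with the Python standard library. This is a sketch only; it assumes, as the text does, that each array of y's is normal with standard deviation $S_y$, and the variable names are illustrative.

```python
from statistics import NormalDist

S_y = 15.83          # standard error of estimate found for Table 36
predicted = 78.96    # value given by equation (34) at x = 32.5

unit_normal = NormalDist()  # mean 0, standard deviation 1

# Probability of a deviation no larger than one S_y on either side.
p_within_one = unit_normal.cdf(1) - unit_normal.cdf(-1)

# Multiple of S_y for which the chances are even.
even_chance_t = unit_normal.inv_cdf(0.75)

low, high = predicted - S_y, predicted + S_y
print(f"P = {p_within_one:.4f}, limits {low:.2f} to {high:.2f}")
print(f"even chances at t = {even_chance_t:.4f}")
```

Replacing the multiple 1 by any other value of t gives the probability for a deviation as small as that multiple of $S_y$.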


To summarize, in a normal bivariate universe each array is a
normal distribution and therefore its mean coincides with its mode
which is its most probable value. Since regression is linear, a value
predicted from the regression equation of y on x is the most probable
value of y for a designated value of x. Then, $P_y = \int_{-t}^{t} \varphi(t)\,dt$ is the
probability for a deviation from the predicted value $\bar{y}_x$ as small as $|t|$
where t is expressed in units of the standard error $S_y$ of a column.
Thus,

$t = \dfrac{y - \bar{y}_x}{S_y}.$

Then $1 - P_y$ is the probability for a deviation as large as $|t|$. Similarly,
when dealing with the regression line of x on y, $P_x = \int_{-t}^{t} \varphi(t)\,dt$ is
the probability for a deviation from the predicted value $\bar{x}_y$ as small
as $|t|$, where now $t = (x - \bar{x}_y)/S_x$.


Exercises 

1. Refer to problem 4, §4. What is the probability that Doe will be less than
65.75 inches or more than 77.75 inches? What are the chances that Roe
will be between 100.8 and 122.4 pounds in weight?

Ans. $1 - P_y = .0027$, $P_x = .5$ (approximately).

2. Discuss the reliability of the predictions which you made in Exercise 3, §16.

Outline of Solution. Suppose a reliability level of $P_y = .5$ is desired. Making
the necessary assumptions, this allows a deviation of $t = \pm .6745$.
Since $S_y = 15.83$ we have

$\dfrac{d}{15.83} = \pm .6745$

where $d = y - \bar{y}_x$. That is, $y = \bar{y}_x \pm\ ?$ For x = 37.5, $y =\ ? \pm\ ?$
So the probability is .5 that the standard of production will be between
what limits for a person making x = 37.5 on the mental test? The
problem is analogous for any other designated value of $P_y$ and for other
assigned values of x.



21. Non-Linear Regression. Correlation Ratio. We have seen 
that the regression systems of a normal correlation surface are linear. 
In a correlation table which is a representative sample from a normal 
bivariate universe the means of the arrays would lie approximately 
on straight lines. But in correlation tables which are samples of 
other types of universes, regression might not be linear. Moreover, 
one of the regression curves might be strictly linear and the other 
non-linear. The following numerical example illustrates the latter 
possibility. 



 y \ x |  0   1   2   3 | f(y)
 ------+----------------+------
   2   |  0   0   0   2 |  2
   1   |  0   1   1   2 |  4
   0   |  1   1   0   0 |  2
 ------+----------------+------
  f(x) |  1   2   1   4 |  8


In this example, the regression of y on x is linear whereas that of x on 
y is non-linear. 
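The claim about this small table can be verified by computing the means of the columns and of the rows. The sketch below keys the frequencies as `table[y][x]`, an illustrative layout rather than anything prescribed by the text.

```python
# Frequencies f(x, y) of the small table, keyed as table[y][x].
table = {
    2: {0: 0, 1: 0, 2: 0, 3: 2},
    1: {0: 0, 1: 1, 2: 1, 3: 2},
    0: {0: 1, 1: 1, 2: 0, 3: 0},
}
xs = [0, 1, 2, 3]
ys = [2, 1, 0]

# Mean of y in each column (array of y's at a fixed x).
column_means = {
    x: sum(y * table[y][x] for y in ys) / sum(table[y][x] for y in ys)
    for x in xs
}

# Mean of x in each row (array of x's at a fixed y).
row_means = {
    y: sum(x * table[y][x] for x in xs) / sum(table[y].values())
    for y in ys
}

print(column_means)
print(row_means)
```

The column means rise by equal steps of one half, so they lie on a straight line, whereas the row means do not change by equal amounts from one row to the next.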

When the means of the columns (or of the rows) do not lie approximately
on a straight line, the use of r may be misleading because
r = 0 indicates absence of linear correlation only and not necessarily
absence of correlation in general.

One of the best treatments of this situation is that given in the
Carus Monograph on Mathematical Statistics, which will be reproduced
substantially here.

In introducing a correlation ratio, $\eta_{yx}$ (eta), of y on x as an appropriate measure
of correlation to take the place of the correlation coefficient in such a situation,
we may get suggestions as to what is appropriate by solving for r in (10). This
gives

$r^2 = 1 - \dfrac{S_y^2}{\sigma_y^2}$

where we may recall that $S_y^2$ is the mean square of deviations from the line of
regression. Then

$r = \pm\left[1 - \dfrac{S_y^2}{\sigma_y^2}\right]^{1/2}.$

This formula could be used appropriately as a definition of r in place of our
definition in (1), and its examination may throw further light on the significance
of r. When $S_y = 0$, the formula gives $r = \pm 1$ and, as we have seen earlier,
all the dots of the scatter diagram must then fall exactly on the line of regression.
When $S_y = \sigma_y$, the formula gives r = 0, and the regression line is in this case of
no aid in predicting the value of y from assigned values of x. In the formula
$r^2 = 1 - S_y^2/\sigma_y^2$ it is important to keep in mind that the mean square deviation
$S_y^2$ is from the line of regression. Next, let $S_y'^2$ be the corresponding mean
square of deviations from the means of columns. Then $S_y'^2 = S_y^2$ when the
regression is strictly linear, but $S_y'^2 \neq S_y^2$ when the regression is non-linear.
This fact suggests the use of a formula closely related to $[1 - S_y^2/\sigma_y^2]^{1/2}$ for a
measure of non-linear regression by replacing $S_y$ by $S_y'$. We then write

(38)  $\eta_{yx}^2 = 1 - \dfrac{S_y'^2}{\sigma_y^2}$

where $\eta_{yx}$ is the correlation ratio of y on x, and $S_y'^2$ is the mean square of deviations
from the means of the columns whether these means are near to or far
from the line of regression.

In general, we may say that the correlation ratio of y on x is a measure of the 
clustering of dots about the means of columns. 

An analogous discussion for the rows obviously leads to

$\eta_{xy}^2 = 1 - \dfrac{S_x'^2}{\sigma_x^2}$

giving $\eta_{xy}^2$, the square of the correlation ratio of x on y.

That $\eta_{yx}^2 \le 1$ and that the equality holds only when all the dots in each
column are at the mean of the column follows at once from (38).

That $\eta_{yx}^2 \ge r^2$ may be shown by recalling the meanings of $S_y^2$ in (37) and
of $S_y'^2$ in (38). A mean square of deviations in each column is a minimum when
the deviations are taken from the mean of the array. Hence, the $S_y'^2$ in (38)
must be equal to or less than $S_y^2$ in (37) for the same data, since the deviations
in (37) are measured from the line of regression. Hence, we have shown that

$1 \ge \eta_{yx}^2 \ge r^2.$

Moreover, when the regression of y on x is linear, $\eta_{yx}^2 - r^2$ found from the sample
differs from zero by an amount not greater than the fluctuations due to random
sampling. Hence, $\eta_{yx}^2 - r^2$ becomes a criterion for testing the linearity of
the regression of y on x.

For computational purposes, it is desirable to express the correlation ratios
in a form involving the standard deviations of the means of arrays. For this
purpose, let $\bar{y}_x$ be the mean of any column of y’s and $\sigma_{\bar{y}_x}$ the standard deviation
of the means of columns when the square $(\bar{y}_x - \bar{y})^2$ of each deviation is weighted
with the number f(x) in the column. Then it follows that

(39)  $\eta_{yx}^2 = \dfrac{\sigma_{\bar{y}_x}^2}{\sigma_y^2}.$

That is, the correlation ratio of y on x is the ratio of the standard deviation of
the means of columns to the standard deviation of all y’s.¹


To prove (39) we must show that $\sigma_y^2 - S_y'^2 = \sigma_{\bar{y}_x}^2$. We begin
by observing that the concentration of the dots in a column about
their mean may be measured in terms of their standard deviation.
Let $\sigma_{y.x}$ denote the standard deviation of the y’s in the column at x.
That is,

(40)  $\sigma_{y.x}^2 = \dfrac{1}{f(x)}\sum_{y} (y - \bar{y}_x)^2 f(x, y).$

1 Rietz, Carus Monograph on Mathematical Statistics, p. 89 et seq. 





Now, the concentration of the dots in the entire table about the
means of the columns may be measured by finding the mean value
of all such expressions $\sigma_{y.x}^2$ for all the columns of the table. But
since there are more points in some columns than in others, it will be
desirable to weight the $\sigma_{y.x}^2$ for each column by multiplying it by
the number of points or dots in the column. It is this weighted
mean value of the $\sigma_{y.x}^2$’s which we have denoted by $S_y'^2$. That is,

(41)  $S_y'^2 = \dfrac{1}{N}\sum_{x} f(x)\,\sigma_{y.x}^2.$

In order to verify (39) we must now show that

$\sigma_y^2 = S_y'^2 + \sigma_{\bar{y}_x}^2.$

Adapting (14) of §9, Chapter V, to the notation of this chapter,
we have

(42)  $N\sigma_y^2 = \sum_{x} f(x)\,\sigma_{y.x}^2 + \sum_{x} f(x)(\bar{y}_x - \bar{y})^2.$

This follows from the fact that N is composed of the several subdistributions
f(x) in the columns, and $\sigma_{y.x}$ is the standard deviation
of a column about its mean $\bar{y}_x$. It is obvious that

$\dfrac{1}{N}\sum_{x} f(x)(\bar{y}_x - \bar{y})^2$

gives the variance $\sigma_{\bar{y}_x}^2$ of the means of the columns. The above
expression (42) then becomes

$N\sigma_y^2 = NS_y'^2 + N\sigma_{\bar{y}_x}^2,$

which reduces to $\sigma_y^2 - S_y'^2 = \sigma_{\bar{y}_x}^2$, and hence to (39).
22. Computation of η. It should be instructive to compute
$\eta_{yx}^2$ for Table 36, by both relations (38) and (39).

For (38) we have the following:

$\eta_{yx}^2 = 1 - \dfrac{S_y'^2}{\sigma_y^2}, \qquad S_y'^2 = \dfrac{1}{N}\sum_{x} f(x)\,\sigma_{y.x}^2,$

 σ_{y.x}²  | f(x)
 ----------+------
 106.12¹   |   7
 191.83    |  14
 246.48    |  32
 283.63    |  49
 257.65    |  55
 294.51    |  54
 222.53    |  35
  71.43    |  14

$S_y'^2 = 246.45$

$\sigma_y^2 = (17.41)^2 = 303.11$

$\eta_{yx}^2 = 1 - \dfrac{246.45}{303.11} = .1869.$

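For (39) we have the following:

$\eta_{yx}^2 = \dfrac{\sigma_{\bar{y}_x}^2}{\sigma_y^2}, \qquad \sigma_{\bar{y}_x}^2 = \dfrac{1}{N}\sum_{x} (\bar{y}_x - \bar{y})^2 f(x), \qquad \bar{y} = 87.31.$

 ȳ_x     | f(x)
 --------+------
  67.86  |   7
  72.14  |  14
  81.87  |  32
  84.80  |  49
  85.73  |  55
  90.92  |  54
  95.57  |  35
 105.00  |  14

$\sigma_{\bar{y}_x}^2 = 56.66$ (See Exercise 3, p. 93).

$\eta_{yx}^2 = \dfrac{56.66}{303.11} = .1869.$

In verifying (39) for this example we have $\sigma_y^2 - S_y'^2 = 303.11 - 246.45 = 56.66$
and $\sigma_{\bar{y}_x}^2 = 56.66$.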
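Both routes to $\eta_{yx}^2$ for Table 36 can be retraced from the tabulated column data. The sketch below uses only the printed column variances, column means, and frequencies; because those inputs are already rounded, the variance of the column means it produces may differ slightly in the second decimal from the printed 56.66. The list names are illustrative.

```python
# Column data for Table 36 as tabulated in the text: frequency f(x),
# variance within the column, and column mean.
f_x = [7, 14, 32, 49, 55, 54, 35, 14]
var_in_column = [106.12, 191.83, 246.48, 283.63, 257.65, 294.51, 222.53, 71.43]
column_mean = [67.86, 72.14, 81.87, 84.80, 85.73, 90.92, 95.57, 105.00]
var_y = 17.41 ** 2

N = sum(f_x)

# Relation (38): one minus the weighted mean column variance over var_y.
S_prime_sq = sum(f * v for f, v in zip(f_x, var_in_column)) / N
eta_sq_38 = 1 - S_prime_sq / var_y

# Relation (39): weighted variance of the column means over var_y.
y_bar = sum(f * m for f, m in zip(f_x, column_mean)) / N
var_of_means = sum(f * (m - y_bar) ** 2 for f, m in zip(f_x, column_mean)) / N
eta_sq_39 = var_of_means / var_y

print(N, round(S_prime_sq, 2), round(eta_sq_38, 4))
print(round(y_bar, 2), round(var_of_means, 2), round(eta_sq_39, 4))
```

The two values of $\eta_{yx}^2$ agree to about three decimal places, which is as much as the rounded inputs allow.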

¹ See Table 27 and Table 16.

The above illustrations are useful in giving an understanding of
the meaning of $\eta_{yx}^2$. However, for computational purposes, another
formula may be derived which involves less labor than either (38)
or (39). In fact, the computation of a correlation ratio may be very
conveniently performed by an easy extension of a correlation table.
The derivation of the appropriate formula will now be given.

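The standard deviation ($\sigma_{\bar{y}_x}$) of the means of the columns may be
expressed in the (u, v) units by the relation $\sigma_{\bar{y}_x}^2 = k^2\sigma_{\bar{v}_u}^2$

where  $\sigma_{\bar{v}_u}^2 = \dfrac{1}{N}\sum_{u} f(u)\,\bar{v}_u^2 - \bar{v}^2$

which is the definition of the standard deviation of the variable $\bar{v}_u$.
This is apparent if we observe that the mean for the whole table in
the v-direction ($\bar{v}$) is the mean of the quantities $\bar{v}_u$ for the several
columns.¹

Since

$\bar{v}_u = \dfrac{V}{f(u)}, \qquad V = \sum_{v} v\,f(u, v),$

we have

$\sigma_{\bar{v}_u}^2 = \dfrac{1}{N}\sum_{u}\dfrac{V^2}{f(u)} - \bar{v}^2.$

Recalling that $\sigma_y^2 = k^2\sigma_v^2$, we have

$\eta_{yx}^2 = \dfrac{\sigma_{\bar{y}_x}^2}{\sigma_y^2} = \dfrac{\sigma_{\bar{v}_u}^2}{\sigma_v^2},$

that is,

(43)  $\eta_{yx}^2 = \dfrac{1}{\sigma_v^2}\left\{\dfrac{1}{N}\sum_{u}\dfrac{V^2}{f(u)} - \bar{v}^2\right\}.$

An analogous discussion for the rows of x’s leads to

(44)  $\eta_{xy}^2 = \dfrac{1}{\sigma_u^2}\left\{\dfrac{1}{N}\sum_{v}\dfrac{U^2}{f(v)} - \bar{u}^2\right\}$

giving the square of the correlation ratio of x on y.

¹ $\dfrac{1}{N}\sum_{u} f(u)\,\bar{v}_u = \dfrac{1}{N}\sum_{u} f(u)\dfrac{1}{f(u)}\sum_{v} v\,f(u, v) = \dfrac{1}{N}\sum_{u}\sum_{v} v\,f(u, v) = \dfrac{1}{N}\sum_{u,v} v\,f(u, v) = \bar{v}.$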




Example. Find $\eta_{yx}^2$ for Table 35. Solution: Referring to this table and
using (43) we obtain the following results:

 V²       |   49     9     9    49    900    121    169 |  Sum
 V²/f(u)  | 3.78   .75   .47  2.45  37.50  20.17  28.17 | 93.29

$\bar{v}^2 = .2916$, $\sigma_v^2 = 1.8084$, N = 100.

$\eta_{yx}^2 = \dfrac{1}{1.8084}\left[\dfrac{1}{100}(93.29) - .2916\right]$

$\eta_{yx}^2 = .3546.$
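The arithmetic of this example follows formula (43) term by term and can be sketched as below. The function name and argument names are illustrative; the inputs are the tabulated values of V²/f(u) and the printed constants.

```python
# Values of V^2 / f(u) tabulated for the columns of Table 35.
v_sq_over_f = [3.78, 0.75, 0.47, 2.45, 37.50, 20.17, 28.17]
v_bar_sq = 0.2916   # square of the mean of v
var_v = 1.8084      # variance of v
N = 100


def eta_squared(terms, n, mean_sq, variance):
    """Formula (43) in the coded (u, v) units."""
    return (sum(terms) / n - mean_sq) / variance


total = sum(v_sq_over_f)
eta_sq = eta_squared(v_sq_over_f, N, v_bar_sq, var_v)
print(round(total, 2), round(eta_sq, 4))
```

Only one extra row, V²/f(u), has to be appended to the correlation table to obtain the correlation ratio in this way.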


It may be well to mention that the value of η is not independent
of the classification of the data. As the class intervals become
narrower, η approaches unity. This may be understood from (38).
If the grouping were so fine that only one item appeared in each
column, then it would constitute the mean of that column. In this
case $S_y'$ would be zero and η would therefore be unity. On the other
hand, a very coarse grouping tends to make the value of η approach r.
“Student” has given a formula for The Correction to be Made in the
Correlation Ratio for Grouping in Biometrika, vol. IX, pp. 316-320.

23. Further Discussion. Test for Linearity of Regression. Let
us consider the totality of mean points $(x, \bar{y}_x)$ of the columns and
think of a curve connecting them. Of course, for a table of observed
data, it is possible to draw many such curves. In order to show
clearly why a comparison of η² and r² is the basis of a test for linearity
of regression, it will be necessary to consider a theoretical table in
which there is only one such curve. When we speak of the regression
curve we are thinking, not of the given table in which the dimensions
of the cells are h and k, but of an ideal table in which there is an
infinity of cells of zero dimensions. To put it another way, consider
a sample of N pairs of values $(x_i, y_i)$ from which a correlation table
is made with cells whose dimensions are h and k. If parallelepipeds
are erected on the cells with heights proportional to the frequencies,
the result is a solid histogram bounded by a broken surface. As
h → 0, k → 0, and N → ∞, this histogram will approach some solid,
bounded by a smooth surface. An example of such a surface is the
normal correlation surface. In such an ideal table, it is possible to
have but one curve connecting the means of the columns. This
curve is sometimes styled the true regression curve of y on x. In an
analogous way for the means of the rows there would be a true
regression curve of x on y. It is one of these curves that we have in
mind when we speak of “the regression curve” or “the regression.”
For a normal bivariate universe (represented by a normal correlation
surface), regression is linear. But for other types of bivariate
universes (which might be represented by skew surfaces), it is
conceivable that regression might be parabolic or exponential or
some other type of curve. In such types, regression is said to be
non-linear. The curve which is chosen to approximate the true
regression curve must not be confused with the true regression
curve. The latter notion relates to the ideal universe from which
the data at hand are a sample. It is defined as the locus
of the mean points of the columns of the theoretical table. When
we fit a curve to the means of the columns of an observed
table, this regression curve is merely an approximation to the ideal
set up in the definition. Similar statements may be made about the
regression of x on y.

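We will now recapitulate the expressions used in the comparative
analysis of r² and $\eta_{yx}^2$ for an observed table.

(45)  $\sigma_{y.x}^2 = \dfrac{1}{f(x)}\sum_{y} (y - \bar{y}_x)^2 f(x, y)$

      $S_y'^2 = \dfrac{1}{N}\sum_{x} \sigma_{y.x}^2\, f(x)$

      $\eta_{yx}^2 = 1 - \dfrac{S_y'^2}{\sigma_y^2}$

(46)  $s_{y.x}^2 = \dfrac{1}{f(x)}\sum_{y} ({}_{o}y - {}_{c}y_x)^2 f(x, y)$

      $S_y^2 = \dfrac{1}{N}\sum_{x} s_{y.x}^2\, f(x)$

      $r^2 = 1 - \dfrac{S_y^2}{\sigma_y^2}$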


Recall that $\sigma_{y.x}^2$ is defined as the variance in a column and therefore
as the square of the standard error about the regression curve, whatever
it may be, which goes through the means of the columns. $S_y'^2$
is an average of the $\sigma_{y.x}^2$ values, and $\eta_{yx}^2$ is defined in terms of $S_y'$.
Correspondingly, $s_{y.x}^2$ is the square of the standard error in a column
about the line which best fits the means of the columns. $S_y^2$ is an
average of the $s_{y.x}^2$ values, and r² is defined in terms of $S_y$. If
regression is linear, the means of the columns will fall on the
“best-fitting line” and $\sigma_{y.x}^2$ becomes the same as $s_{y.x}^2$. Then $S_y'^2 = S_y^2$,
and hence $\eta_{yx}^2 = r^2$.

It is interesting to observe that $\sigma_{y.x}^2$ is the second moment about
the mean, for an array of y’s, i.e., for a column. In the notation
of moments it could be denoted by $\mu_{2:y.x}$. In this notation, $s_{y.x}^2$
could be denoted by $\nu_{2:y.x}$, being the second moment in an array
of y’s about a point other than its mean. Since $\mu_2 \le \nu_2$, it follows
that $\sigma_{y.x}^2 \le s_{y.x}^2$. Therefore $S_y'^2 \le S_y^2$ and $\eta_{yx}^2 \ge r^2$. If each y
value of a column is at the mean of that column then it is
obvious that $\sigma_{y.x}^2$ will be zero. In this case, $S_y' = 0$ and $\eta_{yx}^2 = 1$.
On the other hand, since the dispersion in a column cannot be
greater than the dispersion $\sigma_y$ of the whole table, it follows that
$\sigma_{y.x}^2 \le \sigma_y^2$ and hence that $S_y'^2 \le \sigma_y^2$.

Therefore, $\eta_{yx}^2 \le 1$.

Writing (38) in the form

$S_y' = \sigma_y(1 - \eta_{yx}^2)^{1/2}$

we see that $S_y'$ is a measure of dispersion about the regression curve
(which is the locus of the means) corresponding to $S_y = \sigma_y(1 - r^2)^{1/2}$
which is the standard error about the “best” line. If r² = 1, then
y is related to x by a linear function. If $\eta_{yx}^2 = 1$, it follows that y
is a single-valued function of x. On the other hand, if r² = 0, it does
not necessarily follow that there is no relation¹ between y and x. If
$\eta_{yx}^2 = 0$ then r² = 0, but if r² = 0 it does not necessarily follow that
$\eta_{yx}^2 = 0$.

¹ See H. L. Rietz, On Functional Relations for which the Coefficient of Correlation
is zero. Journal American Statistical Association, vol. 16, 1919, pp. 472-76.

In the ideal table, regression of y on x is linear if and only if
$\eta_{yx}^2 - r^2 = 0$. But in the case of an observed table, allowance must
be made for sampling fluctuations. A corresponding analysis could
be made for r² and $\eta_{xy}^2$, and $\eta_{xy}^2 - r^2$ computed from the sample should
differ from zero by an amount not greater than the fluctuations due
to chance, if regression of x on y is linear. The question naturally
arises, what discrepancy between the computed values of η² and r²
may be tolerated before we conclude that regression is non-linear?
This problem has been investigated, and Blakeman¹ has proposed a
testing formula. If certain assumptions are made, a simple though
approximate test may be deduced from Blakeman’s formula. According
to this approximate test if

(47)  $N(\eta^2 - r^2) < 11.4$

then linear regression may be assumed. Since there are two η² there
are two tests. It is possible for one of the regression curves to be
linear and the other not.

Evaluating (47) for Table 35 we obtain 100[.3546 - (.58)²] = 1.82,
so the regression of y on x may be assumed to be linear.

R. A. Fisher has shown that the Blakeman test is not very reliable.
One can easily construct an example for which regression is obviously
non-linear yet which satisfies the criterion (47). Consider the following
table:

 y \ x | 1  2  3  4  5
 ------+---------------
   3   | 0  0  1  0  0
   2   | 0  1  0  1  0
   1   | 1  0  0  0  1

Here, N = 5, $\sum xy = 27$, $\bar{x} = 3$, $\bar{y} = 9/5$. From (3), therefore,
r = 0. From (40) and (41), $S_y' = 0$ and $\eta_{yx} = 1$. Applying (47),
Blakeman’s test yields a verdict of linear regression of y on x. It
appears that Blakeman’s criterion is of doubtful utility. A more
efficient method of testing linearity of regression is given in Part II.
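The behaviour of criterion (47) on this five-point table, and on the Table 35 figures quoted earlier, can be reproduced as follows. This is a sketch under the assumption that r is the ordinary product-moment coefficient; `blakeman_statistic` is an illustrative name for the left member of (47).

```python
from math import sqrt

# The five observations of the small table, one per occupied cell.
points = [(1, 1), (2, 2), (3, 3), (4, 2), (5, 1)]
N = len(points)

x_bar = sum(x for x, _ in points) / N
y_bar = sum(y for _, y in points) / N
var_x = sum((x - x_bar) ** 2 for x, _ in points) / N
var_y = sum((y - y_bar) ** 2 for _, y in points) / N
cov = sum((x - x_bar) * (y - y_bar) for x, y in points) / N
r = cov / sqrt(var_x * var_y)

# Mean square of deviations from the column means, as in (40) and (41).
columns = {}
for x, y in points:
    columns.setdefault(x, []).append(y)
S_prime_sq = sum(
    sum((y - sum(col) / len(col)) ** 2 for y in col) for col in columns.values()
) / N
eta_sq = 1 - S_prime_sq / var_y


def blakeman_statistic(n, eta_squared, r_value):
    """Left member of the approximate criterion (47)."""
    return n * (eta_squared - r_value ** 2)


print(r, eta_sq, blakeman_statistic(N, eta_sq, r))
print(round(blakeman_statistic(100, 0.3546, 0.58), 2))  # Table 35 figures
```

With only five observations the statistic cannot exceed 5, so the criterion is met no matter how curved the regression is, which is the weakness the example is meant to expose.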


Exercises 

1. Using (43) and (44) find $\eta_{yx}^2$ and $\eta_{xy}^2$ for the table referred to in Exercise 4,
page 192. Apply the test (47) and state your opinion about the linearity
of regressions.


1 See Handbook of Mathematical Statistics , Rietz and others, p. 131. 






2. In the following table, x = Interest Rates, 4-6 months Commercial Paper,
y = Total Bills Discounted by Federal Reserve Banks (1923-1932). Find
r and $\eta_{yx}^2$. Form an opinion about linearity of regression of y on x. (Data
from Elements of Statistics, Davis and Nelson, p. 288.)


Class 

Marks 

y 











7 







1 

6 

6 

6 

6 





1 

2 

3 

4 



5 





1 

3 

1 

2 


■ 

4 




2 


9 

4 

1 


| 

3 


1 

2 

1 

4 

9 

4 



E 

2 


1 



11 

5 

1 




1 

4 


2 

3 

3 

1 





0 

2 

3 

3 

5 

3 






Class

Marks 

X 

0 

1 

2 

3 

4 

5 

6 

7 

8 

9 


3. On page 218, Statistical Methods for Research Workers , 3rd ed., London, 

1930, R. A. Fisher writes: “ The sum of the squares of the deviations of 
all the values of y from their general mean may be broken up into two 
parts, one representing the sum of the squares of the deviations of the 
means of the arrays from the general mean, each multiplied by the number 
in the array, while the second is the sum of the squares of the deviations 
of each observation from the mean of the array in which it occurs.” 
(Compare with our (14a) of Chapter V.) 

Prove Fisher's statement. Hint. In symbols, you are to prove that

$V = V_1 + V_2$

where

$V = \sum_{x,y} (y - \bar{y})^2 f(x, y)$

$V_1 = \sum_{x} (\bar{y}_x - \bar{y})^2 f(x)$

$V_2 = \sum_{x,y} (y - \bar{y}_x)^2 f(x, y).$

4. Prove that $\eta_{yx}^2$ is the ratio between $V_1$ and $V$ as defined in Exercise 3.






5. The mortality experience during the early years of an insurance company
presents an interesting study in correlation. The following table shows
for male lives the correlation between the age (x) of the insured at issue
of policy and his age (y) at death. Data of the Midland Life Insurance
Company,¹ 1906-1924.



Find r, the two η’s, and the equations of the lines of regression.

24. Correlation from Ranks. Before defining rank we will find the
variance of the difference, z, between corresponding values of two
variables. Let x and y denote corresponding values of two series each
consisting of N variates. Form a third series z where $z_i = x_i - y_i$.

Then the mean of z is given by $\bar{z} = \bar{x} - \bar{y}$ and the square of the
standard deviation of z is, by definition,

$\sigma_z^2 = \dfrac{1}{N}\sum z^2 - \bar{z}^2.$

1 From a paper On Certain Applications of Mathematical Statistics to Actuarial 
Data in The Record, American Institute of Actuaries, vol. XIII, Part II, 
No. 28, November, 1924. 
















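Replacing z by its equal x - y, this becomes

$\sigma_z^2 = \dfrac{1}{N}\sum (x^2 - 2xy + y^2) - \bar{x}^2 - \bar{y}^2 + 2\bar{x}\bar{y}$

$= \left\{\dfrac{1}{N}\sum x^2 - \bar{x}^2\right\} - 2\left\{\dfrac{1}{N}\sum xy - \bar{x}\bar{y}\right\} + \left\{\dfrac{1}{N}\sum y^2 - \bar{y}^2\right\}.$

Whence

(48)  $\sigma_z^2 = \sigma_x^2 - 2r\sigma_x\sigma_y + \sigma_y^2.$

If the variables x and y are uncorrelated, we have as a special case

$\sigma_z^2 = \sigma_x^2 + \sigma_y^2.$

Solving (48) for r, we obtain

(49)  $r = \dfrac{\sigma_x^2 + \sigma_y^2 - \sigma_z^2}{2\sigma_x\sigma_y}.$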

This is another expression for the correlation coefficient and involves
standard deviations only. In particular, it may be used to advantage
when x and y denote ranks, where by rank we mean order of magnitude
or importance. That is, rank refers to the position of a variate
in an arrangement.

If x and y denote the ranks of the same item with respect to two
characteristics, and no ranks are omitted, and there are no duplications
of ranks, then both x and y refer to the integers from 1 to N.
Therefore, $\bar{x} = \bar{y}$, and $\sigma_x^2 = \frac{1}{12}(N^2 - 1) = \sigma_y^2$. See Theorem VI,
Chapter V. Moreover,

$\sigma_z^2 = \dfrac{1}{N}\sum z^2 - \bar{z}^2$

$= \dfrac{1}{N}\sum (x - y)^2 - (\bar{x} - \bar{y})^2$

$= \dfrac{1}{N}\sum (x - y)^2$, since $\bar{x} - \bar{y} = 0$.
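When x and y denote ranks rather than the variates themselves, it is
customary to represent the correlation by ρ. Therefore, the expression
for r obtained by solving (48) becomes

\[ \rho = \frac{\frac{1}{12}(N^2 - 1) + \frac{1}{12}(N^2 - 1) - \frac{1}{N}\sum (x - y)^2}{2 \cdot \frac{1}{12}(N^2 - 1)}, \]

which simplifies into

\[ \rho = 1 - \frac{6\sum (x - y)^2}{N(N^2 - 1)}. \tag{50} \]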


If two or more variates are tied it is customary to divide the 
corresponding rank-numbers among the variates concerned, using 
fractions if necessary. 


Example. Suppose we have the following scores made in two tests, arranged
in the order of their rank. Find the correlation between ranks.

               1st Subject             2nd Subject
Individual   Score      Rank = x     Score      Rank = y    x - y   (x - y)²
    A          92          1           85          2         -1        1
    B          86          2           76          4         -2        4
    C          84          3       [illegible]     1          2        4
    D      [illegible]     4       [illegible]     6         -2        4
    E          71          5           67          7         -2        4
    F          69          6           83          3          3        9
    G          66          7           54          9         -2        4
    H          58          8           70          5          3        9
    I          53          9           43         10         -1        1
    J          45         10           59          8          2        4
  N = 10                                                    Total     44

We find

\[ \rho = 1 - \frac{6(44)}{10(99)} = .733. \]
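
The arithmetic of the example can be reproduced in a few lines. This is a minimal sketch, not part of the original text; the function name `spearman_rho` is chosen here for illustration, and the two lists are the ranks x and y of individuals A to J as read from the example.

```python
def spearman_rho(x_ranks, y_ranks):
    """Formula (50): rho = 1 - 6 * sum((x - y)^2) / (N * (N^2 - 1)).

    Returns rho together with the sum of squared rank differences.
    """
    n = len(x_ranks)
    sum_sq = sum((a - b) ** 2 for a, b in zip(x_ranks, y_ranks))
    return 1 - 6 * sum_sq / (n * (n * n - 1)), sum_sq


# Ranks of individuals A..J in the two subjects.
x_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_ranks = [2, 4, 1, 6, 7, 3, 9, 5, 10, 8]

rho, sum_sq = spearman_rho(x_ranks, y_ranks)
print(sum_sq, round(rho, 3))  # 44 0.733
```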

























Exercise

Find ρ for the following data:

Individual   Rank   Score    Rank   Score
    A          1      92       2      88
    B          2      89       4      85
    C          3      87       1      93
    D          4      86       6      79
    E          5      83       7      70
    F          6      77       3      87
    G          7      71       9      52
    H          8      62       5      84
    I          9      53      10      41
    J         10      40       8      64

Ans. ρ = .733.


25. Comparison of ρ with r. It is interesting to compare the
value of ρ in the example of the preceding section with the value of r
between the corresponding scores. Using formula (1) we find
r = .729. Results for the two methods are not always so close.
For the scores in the preceding exercise we find r = .636. We observe
that in this exercise the ranks remained the same as in the illustrative
example, although the scores themselves were changed.
Indeed it is possible to change the variates themselves very much
without changing the ranks. The nature of the distribution of the
variates is a fundamental consideration in the method of correlation
from ranks.

There are formulas and tables expressing the relation between
ρ and r, but they depend upon an assumption as to the distribution
of the variates. The relation

\[ r = 2\sin\left(\frac{\pi}{6}\rho\right) \]

holds only for a normal distribution. It is interesting to compare
the scatter diagram and marginal totals for each of the distributions
given in the preceding example and exercise.

The marginal totals show considerable symmetry in the example
but none in the exercise.

The student should remember that the value of either r or ρ is
only an estimate of the true but unknown value in the population.
We may say by way of summary that ρ indicates the presence of
correlation rather than its extent.
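
The comparison of ρ with r for the exercise of the preceding section may be sketched as follows. This is an illustration only, not part of the original text: the helper names are chosen here, the ranking routine assumes there are no tied scores, and the last line merely evaluates the normal-theory relation 2 sin(πρ/6) for comparison, without any claim that these ten scores are normally distributed.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation from deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks_descending(scores):
    """Rank 1 for the largest score; assumes no two scores are equal."""
    return [1 + sum(other > s for other in scores) for s in scores]


def spearman_rho(x_ranks, y_ranks):
    """Formula (50) of the text."""
    n = len(x_ranks)
    sum_sq = sum((a - b) ** 2 for a, b in zip(x_ranks, y_ranks))
    return 1 - 6 * sum_sq / (n * (n * n - 1))


# Scores of individuals A..J from the exercise of the preceding section.
first = [92, 89, 87, 86, 83, 77, 71, 62, 53, 40]
second = [88, 85, 93, 79, 70, 87, 52, 84, 41, 64]

r = pearson_r(first, second)
rho = spearman_rho(ranks_descending(first), ranks_descending(second))
r_if_normal = 2 * math.sin(math.pi * rho / 6)

print(round(r, 3), round(rho, 3), round(r_if_normal, 3))
```

The first two printed values should reproduce r = .636 and ρ = .733 quoted above; the third shows what r would be expected to be if the normal-theory relation applied.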



[Figure: scatter diagrams, with marginal totals, of the scores in the EXAMPLE and in the EXERCISE; both axes use the class marks 45, 55, 65, 75, 85, 95.]


Exercises

1. Suppose z = x + y. How would this change formulas (48) and (49)?

2. Twelve salesmen are ranked in order of merit for efficiency by their manager.
They are also ranked in accordance with their length of service. What
indication is there of a relation between length of service and efficiency?
(Garrett.)

Salesmen   Years of Service   Order of Merit (Service)   Order of Merit (Effic.)
   A              5                    7.5                        6
   B              2                   11.5                       12
   C             10                    2                          1
   D              8                    4                          9
   E              6                    6                          8
   F              4                    9                          5
   G             12                    1                          2
   H              2                   11.5                       10
   I              7                    5                          3
   J              5                    7.5                        7
   K              9                    3                          4
   L              3                   10                         11

The fractions in the third column denote ties in rank. Thus, A and J each
served five years and each is ranked 7.5. The next individual is ranked 9.
Ans. ρ = .80.
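
The treatment of ties in Exercise 2 can be sketched as follows. This is an illustration only, not part of the original text: the function `average_ranks` is a name chosen here, and it implements the convention described above of dividing the rank-numbers equally among tied variates. Formula (50) is then applied to the resulting ranks just as in the text, although its derivation assumed no duplications of rank.

```python
def average_ranks(values):
    """Rank 1 for the largest value; tied values share the mean of their ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_rho(x_ranks, y_ranks):
    """Formula (50) of the text."""
    n = len(x_ranks)
    sum_sq = sum((a - b) ** 2 for a, b in zip(x_ranks, y_ranks))
    return 1 - 6 * sum_sq / (n * (n * n - 1))


# Salesmen A..L: years of service and order of merit for efficiency.
years = [5, 2, 10, 8, 6, 4, 12, 2, 7, 5, 9, 3]
efficiency_rank = [6, 12, 1, 9, 8, 5, 2, 10, 3, 7, 4, 11]

service_rank = average_ranks(years)
rho = spearman_rho(service_rank, efficiency_rank)
print(service_rank)
print(round(rho, 2))
```

The computed service ranks should agree with the third column of the table, and the value of ρ with the answer given.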


26. Interpretation. Common Elements. Although statistical
theory gives a description of the indicated relationship between two
related variables, the interpretation of the results "abounds in pitfalls
easily overlooked by the unwary, while they are cantering gaily
along upon their arithmetic."

The methodological side has been developed until we can find correlation coefficients
by simply turning a crank, but the explanation of the meaning of the result
after we find it, needs a brain. ... No amount of mathematical training and
ability can take the place of the judgment and common sense that comes from
a knowledge of the field in which the problem lies.¹

In the interpretation of r one should avoid imputing any causal
relationship between the variables. In this connection the following
pungent remarks of Professor E. B. Wilson² may be appropriately
quoted:

Correlation is a mutual affair between two numerical variables; the correlation
coefficient r is symmetrical with respect to them. Strictly, y is not correlated
with x or x with y, but x and y are correlated. Theory is very important in
indicating what facts should be looked for as significant; facts are significant
or important largely as they indicate theory, but neither compels the other, as
the histories of theorizing and of fact finding amply demonstrate . . . Further,
the value of the correlation coefficient depends on the group for which it is determined
or on the universe of which that group is a fair sample. The correlation
coefficient r of height and weight for a group containing humans from infancy to
adult life would be different from, and in fact greater than, the coefficient for
college students or for the members of a football squad; there is no such thing
as the correlation coefficient per se.

If the student has mastered the underlying mathematical theory
he should be able to understand and profit by the interpretations
given by the writers in his particular field of interest. As a final
aid in forming a conception of its meaning, we state a theorem which
gives to r a meaning in pure chance. If x and y are affected by s
equally likely causes of which t are common to both, then r = t/s.

Theorem VI. An urn containing white and black balls is so maintained
that in drawing a ball the probability of getting a white ball is a
constant p and that of getting a black ball is q (= 1 - p). The first
drawing of a pair of drawings is to consist of s balls taken one at a time
from the urn. The second drawing is to consist of s balls of which t are
taken at random from the s first drawn, and s - t are drawn one at a time
from the urn. Then the correlation coefficient between the numbers of
white balls in the two drawings is t/s.

As an illustration of the theorem we will take s = 5, t = 3, p = 1/4.
Let x be the number of white balls in the first drawing and y the
number of white balls in the second drawing. Then Table 37,

1 Crathorne in Journal of the American Statistical Association, vol. 26, Supplement, March, 1931, p. 27.

2 Correlation and Association, Journal of the American Statistical Association, vol. 26 (1931), pp. 250-256.



constructed by the theory of probability,¹ exhibits the a priori frequencies
when we use as small numbers as possible for frequencies
subject to the condition that each frequency is to be an integer.


Table 37. A priori Frequencies

  y \ x       0       1       2       3       4       5      f(y)
    5         0       0       0       9       6       1        16
    4         0       0      81     108      45       6       240
    3         0     243     648     432     108       9      1440
    2       243    1620    1728     648      81       0      4320
    1      1458    3159    1620     243       0       0      6480
    0      2187    1458     243       0       0       0      3888
  f(x)     3888    6480    4320    1440     240      16    16,384


According to the theorem the correlation coefficient should be 3/5.
It is left as an exercise for the student to show, by computing r from
the table, that this is actually the case.
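
The a priori frequencies and the value of r may also be checked by machine. The following is a minimal sketch under the assumptions of the theorem, not the author's own construction: the function names are chosen here for illustration, exact fractions are used so that no rounding enters the table, and the number of white balls carried over from the first drawing is treated as a hypergeometric variable.

```python
from fractions import Fraction
from math import comb, sqrt


def joint_distribution(s, t, p):
    """Exact joint probabilities P(x, y) for the urn scheme of the theorem."""
    q = 1 - p
    joint = {}
    for x in range(s + 1):  # white balls in the first drawing
        p_x = comb(s, x) * p**x * q ** (s - x)
        # w = white balls among the t taken at random from the first s
        for w in range(max(0, t - (s - x)), min(t, x) + 1):
            p_w = Fraction(comb(x, w) * comb(s - x, t - w), comb(s, t))
            # u = white balls among the s - t fresh draws from the urn
            for u in range(s - t + 1):
                p_u = comb(s - t, u) * p**u * q ** (s - t - u)
                key = (x, w + u)
                joint[key] = joint.get(key, 0) + p_x * p_w * p_u
    return joint


def correlation(joint):
    """Correlation coefficient of x and y from a joint probability table."""
    ex = sum(pr * x for (x, y), pr in joint.items())
    ey = sum(pr * y for (x, y), pr in joint.items())
    vx = sum(pr * x * x for (x, y), pr in joint.items()) - ex * ex
    vy = sum(pr * y * y for (x, y), pr in joint.items()) - ey * ey
    cov = sum(pr * x * y for (x, y), pr in joint.items()) - ex * ey
    return float(cov) / sqrt(float(vx * vy))


s, t, p = 5, 3, Fraction(1, 4)
joint = joint_distribution(s, t, p)
total = 16384  # grand total of Table 37

# Row y = 2 of the table, for x = 0, 1, ..., 5.
row_y2 = [int(joint.get((x, 2), 0) * total) for x in range(s + 1)]
r = correlation(joint)
print(row_y2)
print(r)
```

With s = 5, t = 3, p = 1/4 the printed row should agree with the row y = 2 of Table 37, and r with t/s.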


1 Explained in Part II. 














Review Questions and Problems 

1. Define the following terms: statistics, variate, discrete, class interval, class 

mark, x-array of y’s, range, regression line, sample, universe, coefficient of 
variation. 

2. Define the five measures of central tendency. Discuss their advantages and 

limitations. 

3 . What does a ratio chart show that a chart with a uniform scale does not? If 

you wished to plot data so as to secure the effect of a ratio chart, but had 
no ratio paper available, how would you accomplish the desired result? 

4. Prove the following:

(a) The algebraic sum of the deviations of the variates from their mean
is zero.

(b) The second moment about an arbitrary point equals the second moment
about the mean increased by the square of the distance between
the arbitrary point and the mean.

5. (a) Define and explain how to compute the following:

$Q_1$, $Q_2$, $Q$, MD, $s$, $\sigma$.

(b) In the case of a normal distribution give the value of each of the first
four constants in (a) in terms of $\bar{x}$ or $\sigma$.

6. (a) Give the equation of the normal curve in both arbitrary coordinates and
standard units. State the relation between abscissas and between ordinates
in the two systems.

(b) State the properties of the normal curve.

7. Show how to fit a straight line y = mx + k by the method of moments by
deriving the expressions for m and k.

8. Show how to fit an exponential function by the method explained in the text.

9. Show how to fit a parabola by the method of moments.

10. (a) Give two of the formulas for r. Discuss the use or uses of correlation in
any problem that occurs to you.

(b) Show that the slope of the line in Question 7 may be written $r\sigma_y/\sigma_x$.

11. (a) Prove that $|r| \le 1$.

(b) Define the correlation ratio. Discuss its use.

12. Discuss rank correlation.

13. Derive the following relations:

$x = cu + x_0$

$\mu_2 = \nu_2 - \nu_1^2$

$\mu_{2:x} = c^2 \mu_{2:u}$

$\sigma_x = c\,\sigma_u$.


14. The following is a reduced distribution of the breakfast checks at a cafeteria.
Using the indirect method find $\bar{x}$ and $\sigma_x$.

   x        f
  8-12      4
 13-17      8
 18-22     24
 23-27     21
 28-32     15
 33-37     14
 38-42      7
 43-47      4
 48-52      2
 53-57      1

Ans. $\bar{x}$ = 27.2¢, $\sigma$ = 9.4¢.

15. Derive the relations which give the third and fourth moments about the
mean in terms of moments about an arbitrary origin. Define $\alpha_3$ and $\alpha_4$.
What information do they give?

16. Compute the value of $\alpha_3$ and of $\alpha_4$ for the distribution in Exercise 14.

17. The following is a distribution of the heights of students where x denotes
heights in inches and f is the number of students of the corresponding
height. Find $\bar{x}$, $\sigma_x$, $\alpha_3$, and $\alpha_4$.

   x       f
  60.5     1
  62.0     3
  63.5    14
  65.0    32
  66.5    61
  68.0    80
  69.5    71
  71.0    35
  72.5    24
  74.0     2
  75.5     1

18. For N values of a variable v it is known that $\sum v = 0$ and $\sum v^2 = N$. What
is the origin and unit of v?

19. Find in two ways the value of P for which the function

\[ y = \sum (x - P)^2 \]

has the smallest value.

20. (Walker.) An algebra test was given to 400 high school children, of whom
150 were boys and 250 were girls. The results were as follows:

$n_1 = 150$,  $n_2 = 250$
$\bar{x}_1 = 72.5$,  $\bar{x}_2 = 73.6$
$\sigma_1 = 7.0$,  $\sigma_2 = 6.4$

Find the mean and standard deviation of the combined groups.




21. For a normal distribution of 1500 students' grades, $\bar{x} = 75$, $\sigma_x = 10$. What
values of x will include the middle 500 grades? How many grades were
below 60; above 90?

22. Suppose a distribution of 1000 breakfast checks from the cafeteria mentioned
in problem 14 showed the following results: $\bar{x}$ = 27¢, $\sigma_x$ = 9¢, $\alpha_3 = 0$,
$\alpha_4 = 3$. On the basis of these results what is the expected frequency in
the 23-27¢ class interval?

23. Given the following data as to the heights (y) and weights (x) of college men:

$\sum y = 6{,}800$,  $\sum y^2 = 463{,}025$,  $\sum xy = 1{,}022{,}250$,
$\sum x = 15{,}000$,  $\sum x^2 = 2{,}272{,}500$,  $N = 100$.

Find $\bar{y}$, $\sigma_x$, $\sigma_y$, $r$.

24. Derive the expression for the standard error of estimate,

\[ S_y = \sigma_y (1 - r^2)^{1/2}. \]

25. Discuss the use of $S_y$ in predictions.

26. Compute the median, quartiles, and quartile deviation for the following
distribution where x = bushels per acre and f = corresponding frequency.

   x      f
   1      3
   3     26
   5     78
   7    107
   9    113
  11     65
  13     40
  15     22
  17     45
  19     41
  21     21
  23     23


27. (a) Find r for the following table using (u, v) coordinates.




19 

21 

23 

f(y) 

18 


B 

2 

1 

6 

15 



3 


10 

12 


B 

1 

| 

4 

fix) 

4 

8 

6 




(b) For the above data, find $\bar{x}$, $\bar{y}$, $\sigma_x$, $\sigma_y$, and the equations of the regression
lines.




















DES MOINES 




28. For Table 38, (a) find the correlation coefficient, (b) find the equations of the
lines of regression, (c) locate the coordinate axes through the arithmetic
mean of the table and plot the lines obtained in (b).

Table 38 — Correlation Table for Monthly Rainfall at Iowa City and 
Des Moines, 1890-1925 

IOWA CITY 


2/\. 

1 

© 

a 

o 

iO 

a 

S 

S 

*0 

a 

d 

a 

d 

a 

CO 

a 

CO 

d 

«o 

a 

d 

»o 

a 

d 

2 

S 

d 

a 

d 

a 

d 

iO 

a 

d 

iO 

a 

d 

1 

o6 

8.745 

d 

9.745 ! 

| 10.245 

| 10.745 

/( V) 

10.245 



















1 




H 

9.745 
















1 







■ 

9.245 

















1 

1 





2 

8.745 













1 






1 




2 

8.245 













2 


1 

1 







4 

7.745 











1 











1 

2 

7.245 










2 

1 


1 


2 

1 







7 

6.745 

i 

■ 

2 


1 


2 


1 



1 



1 



1 

1 




10 

6.245 

| 











1 











1 


I 





4 

1 


1 




1 

2 



1 







BPS 

i 

■ 

1 


1 

2 



2 


2 

1 


1 

1 




1 




12 

4.745 





2 

1 

2 

2 

1 

1 

1 


1 



2 



1 




14 

4.245 



1 


1 

1 

2 


1 

4 

1 

2 

1 










14 

3.745 




4 

2 

1 

2 

6 

5 

3 

2 

2 

2 


1 









3.245 


2 

1 

4 

6 

6 

3 

7 

2 


1 






i; 




,1 


34 

2.745 



3 

4 

1 

8 

4 

4 

2 

1 

2 



1 









30 

2.245 



1 

5 

E 

7 

6 

4 

2 

2 













37 

1.745 

1 

4 

: 7 

12 

13 

8 

5 

1 

1 

1 

2 


1 










56 

1.245 

3 

8 

; 18 

17 

6 

8 

4 

2 


1 













67 

0.745 

6 

16 

i 21 

12 

6 

1 

1 

1 



1 



1 









66 

irazi 

13 

12 

! 3 

; 4 



















32 

fix) 

23 

42 

! 58 

> 62 

49 

47 

32 

27 

18 

15 

14 

7 

B 

5 

6 

5 

3 

2 

5 

0 

1 

1 

432 





































29. Fit an exponential function of the type $y = Ae^{Bx}$ to the following data:

  x     0     2     4
  y     2    10   100

First find the equation in the forms

(a) Y = at + b

(b) Y = mx + k

and then determine A and B.

30 . How does the scatter diagram assist one in deciding whether the regression is 

linear or non-linear? Give the formulas for the correlation coefficient 
and for the correlation ratio of y on x , explaining the meaning of the letters 
used. How would you use these indices of correlation to decide whether 
the regression of y on x is linear or non-linear? 

31. (a) In a normal distribution in which $\bar{x} = 0$ and $\sigma_x = 4$, what proportion
of the data lie where x > 12?

(b) If 100 of the data lie between x = -6 and x = -8, how many of the data
are there in the whole distribution?

32. (a) When the variates are ungrouped what is perhaps the best formula
for $\sigma_x$? Ans.

\[ \sigma_x = \frac{1}{N}\Big[N\sum x^2 - \Big(\sum x\Big)^2\Big]^{1/2} \]

(b) What does this expression become in terms of N when x refers to the
integers from 1 to N?

33. (a) Expand $(a + b + c + d)^2$.

(b) Show that the expansion of $(x_1 + x_2 + \cdots + x_n)^2$ consists of the
sum of the squares of the x's plus twice the sum of their products taken
two at a time. Express this expansion in summation notation.

34 . (a) Show that the formula for MD may be written 

MD = fM]. 

N x { <x X i <x 

Hint. For Xi < 5, YLfi i & ~ %\ =—^fi(xi — x) = J2fi(x — Xi) = 

- Em-- 

For Xi > X, E/i \ x i — = — E/i( 5 ~ x i)- 

Since x is the centroid (§14, Chapter III), — E/»(^ “ x *) f° r x < > x equals 
E/dz — x %) for Xi < x. 

(b ) Using this formula evaluate MD for one of the distributions in the text. 

35. Given N pairs of variates: $(x_{11}, x_{21}); (x_{12}, x_{22}); (x_{13}, x_{23}); \cdots; (x_{1n}, x_{2n})$.
Show that:

(a) the mean $\bar{x}$ of all the variates is

\[ \bar{x} = \frac{1}{2N}\sum_{i=1}^{n} (x_{1i} + x_{2i}), \]

(b) the variance $\sigma^2$ taken about the $\bar{x}$ in (a) is

\[ \sigma^2 = \frac{1}{2N}\Big[\sum (x_{1i} - \bar{x})^2 + \sum (x_{2i} - \bar{x})^2\Big]. \]

Note. The quantity

\[ R = \frac{1}{N\sigma^2}\sum_{i=1}^{n} (x_{1i} - \bar{x})(x_{2i} - \bar{x}), \]

where $\bar{x}$ and $\sigma^2$ are defined as in (a) and (b), is called the intra-class correlation
coefficient. For its use see Statistical Methods for Research Workers,
Fisher (p. 179, 3rd Ed.), Oliver and Boyd, London.

36. Let $S_r = \sum_{x=1}^{N} x^r$. Prove that $S_1 = N(N + 1)/2$,
$S_2 = N(N + 1)(2N + 1)/6$, $S_3 = S_1^2$.
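
The identities of Problem 36 are easily tested for particular values of N before a proof is attempted. The sketch below is an illustration only, not part of the original text and not a proof; the function names and the sampled values of N are chosen here. It also checks that the variance of the integers from 1 to N equals (N² - 1)/12, the result used for ranks in the section on correlation from ranks.

```python
from fractions import Fraction


def power_sum(n, r):
    """S_r = 1^r + 2^r + ... + n^r."""
    return sum(x**r for x in range(1, n + 1))


def variance_of_integers(n):
    """Exact variance of 1, 2, ..., n: (1/n) * S_2 - ((1/n) * S_1)^2."""
    mean = Fraction(power_sum(n, 1), n)
    return Fraction(power_sum(n, 2), n) - mean * mean


checked = []
for n in (1, 2, 5, 10, 37):
    s1, s2, s3 = power_sum(n, 1), power_sum(n, 2), power_sum(n, 3)
    ok = (
        s1 == n * (n + 1) // 2
        and s2 == n * (n + 1) * (2 * n + 1) // 6
        and s3 == s1**2
        and variance_of_integers(n) == Fraction(n * n - 1, 12)
    )
    checked.append(ok)

print(all(checked))
```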



APPENDIX


Tables

I. Ordinates and Areas of the Normal Curve.

II. Common Logarithms of Numbers to Five Decimal Places.



Table I. Ordinates and Areas of the Normal Curve, $\phi(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}$

Each group of three columns gives t, $\phi(t)$, and $\int_0^t \phi(t)\,dt$.

  t    ordinate   area        t    ordinate   area        t    ordinate   area
 .00   .39894   .00000       .45   .36053   .17364       .90   .26609   .31594
 .01   .39892   .00399       .46   .35889   .17724       .91   .26369   .31859
 .02   .39886   .00798       .47   .35723   .18082       .92   .26129   .32121
 .03   .39876   .01197       .48   .35553   .18439       .93   .25888   .32381
 .04   .39862   .01595       .49   .35381   .18793       .94   .25647   .32639
 .05   .39844   .01994       .50   .35207   .19146       .95   .25406   .32894
 .06   .39822   .02392       .51   .35029   .19497       .96   .25164   .33147
 .07   .39797   .02790       .52   .34849   .19847       .97   .24923   .33398
 .08   .39767   .03188       .53   .34667   .20194       .98   .24681   .33646
 .09   .39733   .03586       .54   .34482   .20540       .99   .24439   .33891
 .10   .39695   .03983       .55   .34294   .20884      1.00   .24197   .34134
 .11   .39654   .04380       .56   .34105   .21226      1.01   .23955   .34375
 .12   .39608   .04776       .57   .33912   .21566      1.02   .23713   .34614
 .13   .39559   .05172       .58   .33718   .21904      1.03   .23471   .34850
 .14   .39505   .05567       .59   .33521   .22240      1.04   .23230   .35083
 .15   .39448   .05962       .60   .33322   .22575      1.05   .22988   .35314
 .16   .39387   .06356       .61   .33121   .22907      1.06   .22747   .35543
 .17   .39322   .06749       .62   .32918   .23237      1.07   .22506   .35769
 .18   .39253   .07142       .63   .32713   .23565      1.08   .22265   .35993
 .19   .39181   .07535       .64   .32506   .23891      1.09   .22025   .36214
 .20   .39104   .07926       .65   .32297   .24215      1.10   .21785   .36433
 .21   .39024   .08317       .66   .32086   .24537      1.11   .21546   .36650
 .22   .38940   .08706       .67   .31874   .24857      1.12   .21307   .36864
 .23   .38853   .09095       .68   .31659   .25175      1.13   .21069   .37076
 .24   .38762   .09483       .69   .31443   .25490      1.14   .20831   .37286
 .25   .38667   .09871       .70   .31225   .25804      1.15   .20594   .37493
 .26   .38568   .10257       .71   .31006   .26115      1.16   .20357   .37698
 .27   .38466   .10642       .72   .30785   .26424      1.17   .20121   .37900
 .28   .38361   .11026       .73   .30563   .26730      1.18   .19886   .38100
 .29   .38251   .11409       .74   .30339   .27035      1.19   .19652   .38298
 .30   .38139   .11791       .75   .30114   .27337      1.20   .19419   .38493
 .31   .38023   .12172       .76   .29887   .27637      1.21   .19186   .38686
 .32   .37903   .12552       .77   .29659   .27935      1.22   .18954   .38877
 .33   .37780   .12930       .78   .29431   .28230      1.23   .18724   .39065
 .34   .37654   .13307       .79   .29200   .28524      1.24   .18494   .39251
 .35   .37524   .13683       .80   .28969   .28814      1.25   .18265   .39435
 .36   .37391   .14058       .81   .28737   .29103      1.26   .18037   .39617
 .37   .37255   .14431       .82   .28504   .29389      1.27   .17810   .39796
 .38   .37115   .14803       .83   .28269   .29673      1.28   .17585   .39973
 .39   .36973   .15173       .84   .28034   .29955      1.29   .17360   .40147
 .40   .36827   .15542       .85   .27798   .30234      1.30   .17137   .40320
 .41   .36678   .15910       .86   .27562   .30511      1.31   .16915   .40490
 .42   .36526   .16276       .87   .27324   .30785      1.32   .16694   .40658
 .43   .36371   .16640       .88   .27086   .31057      1.33   .16474   .40824
 .44   .36213   .17003       .89   .26848   .31327      1.34   .16256   .40988



  t    ordinate   area        t    ordinate   area        t    ordinate   area
1.35   .16038   .41149      1.80   .07895   .46407      2.25   .03174   .48778
1.36   .15822   .41309      1.81   .07754   .46485      2.26   .03103   .48809
1.37   .15608   .41466      1.82   .07614   .46562      2.27   .03034   .48840
1.38   .15395   .41621      1.83   .07477   .46638      2.28   .02965   .48870
1.39   .15183   .41774      1.84   .07341   .46712      2.29   .02898   .48899
1.40   .14973   .41924      1.85   .07206   .46784      2.30   .02833   .48928
1.41   .14764   .42073      1.86   .07074   .46856      2.31   .02768   .48956
1.42   .14556   .42220      1.87   .06943   .46926      2.32   .02705   .48983
1.43   .14350   .42364      1.88   .06814   .46995      2.33   .02643   .49010
1.44   .14146   .42507      1.89   .06687   .47062      2.34   .02582   .49036
1.45   .13943   .42647      1.90   .06562   .47128      2.35   .02522   .49061
1.46   .13742   .42786      1.91   .06439   .47193      2.36   .02463   .49086
1.47   .13542   .42922      1.92   .06316   .47257      2.37   .02406   .49111
1.48   .13344   .43056      1.93   .06195   .47320      2.38   .02349   .49134
1.49   .13147   .43189      1.94   .06077   .47381      2.39   .02294   .49158
1.50   .12952   .43319      1.95   .05959   .47441      2.40   .02239   .49180
1.51   .12758   .43448      1.96   .05844   .47500      2.41   .02186   .49202
1.52   .12566   .43574      1.97   .05730   .47558      2.42   .02134   .49224
1.53   .12376   .43699      1.98   .05618   .47615      2.43   .02083   .49245
1.54   .12188   .43822      1.99   .05508   .47670      2.44   .02033   .49266
1.55   .12001   .43943      2.00   .05399   .47725      2.45   .01984   .49286
1.56   .11816   .44062      2.01   .05292   .47778      2.46   .01936   .49305
1.57   .11632   .44179      2.02   .05186   .47831      2.47   .01889   .49324
1.58   .11450   .44295      2.03   .05082   .47882      2.48   .01842   .49343
1.59   .11270   .44408      2.04   .04980   .47932      2.49   .01797   .49361
1.60   .11092   .44520      2.05   .04879   .47982      2.50   .01753   .49379
1.61   .10915   .44630      2.06   .04780   .48030      2.51   .01709   .49396
1.62   .10741   .44738      2.07   .04682   .48077      2.52   .01667   .49413
1.63   .10567   .44845      2.08   .04586   .48124      2.53   .01625   .49430
1.64   .10396   .44950      2.09   .04491   .48169      2.54   .01585   .49446
1.65   .10226   .45053      2.10   .04398   .48214      2.55   .01545   .49461
1.66   .10059   .45154      2.11   .04307   .48257      2.56   .01506   .49477
1.67   .09893   .45254      2.12   .04217   .48300      2.57   .01468   .49492
1.68   .09728   .45352      2.13   .04128   .48341      2.58   .01431   .49506
1.69   .09566   .45449      2.14   .04041   .48382      2.59   .01394   .49520
1.70   .09405   .45543      2.15   .03955   .48422      2.60   .01358   .49534
1.71   .09246   .45637      2.16   .03871   .48461      2.61   .01323   .49547
1.72   .09089   .45728      2.17   .03788   .48500      2.62   .01289   .49560
1.73   .08933   .45818      2.18   .03706   .48537      2.63   .01256   .49573
1.74   .08780   .45907      2.19   .03626   .48574      2.64   .01223   .49585
1.75   .08628   .45994      2.20   .03547   .48610      2.65   .01191   .49598
1.76   .08478   .46080      2.21   .03470   .48645      2.66   .01160   .49609
1.77   .08329   .46164      2.22   .03394   .48679      2.67   .01130   .49621
1.78   .08183   .46246      2.23   .03319   .48713      2.68   .01100   .49632
1.79   .08038   .46327      2.24   .03246   .48745      2.69   .01071   .49643



  t    ordinate   area        t    ordinate   area        t    ordinate   area
2.70   .01042   .49653      3.15   .00279   .49918      3.60   .00061   .49984
2.71   .01014   .49664      3.16   .00271   .49921      3.61   .00059   .49985
2.72   .00987   .49674      3.17   .00262   .49924      3.62   .00057   .49985
2.73   .00961   .49683      3.18   .00254   .49926      3.63   .00055   .49986
2.74   .00935   .49693      3.19   .00246   .49929      3.64   .00053   .49986
2.75   .00909   .49702      3.20   .00238   .49931      3.65   .00051   .49987
2.76   .00885   .49711      3.21   .00231   .49934      3.66   .00049   .49987
2.77   .00861   .49720      3.22   .00224   .49936      3.67   .00047   .49988
2.78   .00837   .49728      3.23   .00216   .49938      3.68   .00046   .49988
2.79   .00814   .49736      3.24   .00210   .49940      3.69   .00044   .49989
2.80   .00792   .49744      3.25   .00203   .49942      3.70   .00042   .49989
2.81   .00770   .49752      3.26   .00196   .49944      3.71   .00041   .49990
2.82   .00748   .49760      3.27   .00190   .49946      3.72   .00039   .49990
2.83   .00727   .49767      3.28   .00184   .49948      3.73   .00038   .49990
2.84   .00707   .49774      3.29   .00178   .49950      3.74   .00037   .49991
2.85   .00687   .49781      3.30   .00172   .49952      3.75   .00035   .49991
2.86   .00668   .49788      3.31   .00167   .49953      3.76   .00034   .49992
2.87   .00649   .49795      3.32   .00161   .49955      3.77   .00033   .49992
2.88   .00631   .49801      3.33   .00156   .49957      3.78   .00031   .49992
2.89   .00613   .49807      3.34   .00151   .49958      3.79   .00030   .49992
2.90   .00595   .49813      3.35   .00146   .49960      3.80   .00029   .49993
2.91   .00578   .49819      3.36   .00141   .49961      3.81   .00028   .49993
2.92   .00562   .49825      3.37   .00136   .49962      3.82   .00027   .49993
2.93   .00545   .49831      3.38   .00132   .49964      3.83   .00026   .49994
2.94   .00530   .49836      3.39   .00127   .49965      3.84   .00025   .49994
2.95   .00514   .49841      3.40   .00123   .49966      3.85   .00024   .49994
2.96   .00499   .49846      3.41   .00119   .49968      3.86   .00023   .49994
2.97   .00485   .49851      3.42   .00115   .49969      3.87   .00022   .49995
2.98   .00471   .49856      3.43   .00111   .49970      3.88   .00021   .49995
2.99   .00457   .49861      3.44   .00107   .49971      3.89   .00021   .49995
3.00   .00443   .49865      3.45   .00104   .49972      3.90   .00020   .49995
3.01   .00430   .49869      3.46   .00100   .49973      3.91   .00019   .49995
3.02   .00417   .49874      3.47   .00097   .49974      3.92   .00018   .49996
3.03   .00405   .49878      3.48   .00094   .49975      3.93   .00018   .49996
3.04   .00393   .49882      3.49   .00090   .49976      3.94   .00017   .49996
3.05   .00381   .49886      3.50   .00087   .49977      3.95   .00016   .49996
3.06   .00370   .49889      3.51   .00084   .49978      3.96   .00016   .49996
3.07   .00358   .49893      3.52   .00081   .49978      3.97   .00015   .49996
3.08   .00348   .49897      3.53   .00079   .49979      3.98   .00014   .49997
3.09   .00337   .49900      3.54   .00076   .49980      3.99   .00014   .49997
3.10   .00327   .49903      3.55   .00073   .49981
3.11   .00317   .49906      3.56   .00071   .49981
3.12   .00307   .49910      3.57   .00068   .49982
3.13   .00298   .49913      3.58   .00066   .49983
3.14   .00288   .49916      3.59   .00063   .49983
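
Entries of Table I can be recomputed when a table is not at hand. The following is a minimal sketch, not part of the original text: the function names are chosen here for illustration, and the area from 0 to t is obtained from the error function of the Python standard library through the identity that this area equals one half of erf(t/√2).

```python
import math


def ordinate(t):
    """phi(t) = exp(-t^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)


def area(t):
    """Area under phi from 0 to t, computed as erf(t / sqrt(2)) / 2."""
    return 0.5 * math.erf(t / math.sqrt(2))


# Print a few rows in the layout of Table I: t, ordinate, area.
for t in (0.0, 1.0, 2.0, 3.0):
    print(f"{t:.2f}  {ordinate(t):.5f}  {area(t):.5f}")
```

A doubtful entry in a printed copy of the table can be settled in this way by comparing it with the computed value to five decimal places.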



Table II. Common Logarithms of Numbers to Five Decimal Places 



043 




043 I 087 I 130 | 173 217 260 



04139 179 218 258 



Prop. Parts 


44 

43 

43 

4.4 

4.3 

4.2 

8.8 

8.6 

8.4 


3 13.2 12.9 12.6 

4 17 6 17 2 16.8 

490 5 22.0 21.5 21.0 

898 6 26.4 25.8 25.2 

7 30.8 30.1 29.4 
*302 8 35.2 34.4 33.6 

703 9 39.6 38.7 37.8 


07 918 

954 

990 *027 

*063 

*099 

*135 

*171 

*207 

08 279 

314 

350 


422 

458 

493 

529 

565 

636 

672 

707 

ESI 

778 

814 

849 

884 

920 

08 991 

*026 

*061 

*096 

*132 

*167 

*202 

*237 

*272 

09 342 

377 

412 

447 

482 

517 

552 

587 

621 

09 691 

726 

760 

795 

830 

864 

899 

934 

968 

10 037 

072 

106 

140 

175 

209 

243 

278 

312 

■ 

380 

415 

449 

483 

517 

551 

585 

619 

653 

10 721 

755 

789 

823 

857 

890 

924 

958 

992 

11059 

093 

126 

160 

193 

227 

261 

294 

327 

394 

428 

461 

494 

528 

561 

594 

628 

661 


31 11 727 760 793 826 860 

32 12 057 090 123 156 189 

33 385 418 450 483 516 j 548 

34 12 710 743 775 808 840 

35 13 033 066 098 130 162 

36 354 386 418 450 481 

37 672 704 735 767 799 

38 13 988 *019 *051 *082 *114 

39 14 301 333 364 395 426 


41 

40 

39 

4.1 

4 

3.9 

8.2 

8 

7.8 

12.3 

12 

11.7 

16.4 

16 

15.6 

20.5 

20 

19.5 

24.6 

24 

23.4 

28.7 

28 

27.3 

32.8 

32 

31.2 

36.9 

36 

35.1 

38 

37 

36 

3:8 

3.7 

3.6 

7.6 

7.4 

7.2 

11.4 

11.1 

10.8 

15.2 

14.8 

14.4 

19.0 

18.5 

18.0 

22.8 

22.2 21.6 

26.6 

25.9 25.2 

30.4 

29.6 

28.8 

34.2 

33.3 

32.4 

35 

34 

33 



Reprinted by permission from “ Plane Trigonometry ” by Simmons and Gore, John Wiley 
& Sons, Inc. 



































































































Prop. Parts 


197 

218 

408 

429 


49 62 

250 39 794 


325 

346 

366 387 

408 

531 

552 

572 593 

613 

736 

756 

777 797 

818 

940 

960 

980 *001 

*021 

143 

163 

183 203 

224 

345 

365 

385 405 

425 

546 

566 

586 606 

626 

746 

766 

786 806 

826 

945 

965 

985 *005 

*025 

143 

163 

183 203 

223 


5961 616 
7921 811 
986 


4 

8.8 8.4 

5 

11.0 10.5 

6 

13.2 12.6 

7 

15.4 14.7 

8 

17.6 16.8 

9 

19.8 18.9 


20 

1 

2 

2 

4 

3 

6 

4 

8 

5 

10 

6 

12 

7 

14 

8 

16 

9 

18 


19 

1 

1.9 

2 

3.8 

3 

5.7 

4 

7.6 

5 

9.5 

6 

11.4 

7 

13.3 

8 

15.2 

9 

17.1 


18 

1 

1.8 

2 

3.6 

3 

5.4 


Prop. Parts 





































Prop. Parts 




16 

1 

1.6 

2 

3.2 

3 

4.8 

4 

6.4 

5 

8.0 

6 

9.6 

7 

11.2 

8 

12.8 

9 

14.4 


15 

1 

1.5 

2 

3.0 

3 

4.5 

4 

6.0 

5 

7.5 

6 

9.0 

7 

8 

9 

10.5 

12.0 

13.5 


14 

1 

1.4 

2 

2.8 

3 

4.2 

4 

5.6 



Prop. Parts 


































2 3 4 


8 9 


Prop. Parts 


01 47 857 871 885 900 914 929 943 958 972 986

02 48 001 015 029 044 058 073 087 101 116 130

03 144 159 173 187 202 216 230 244 259 273

04 287 302 316 330 344 359 373 387 401 416

05 430 444 458 473 487 501 515 530 544 558

06 572 586 601 615 629 643 657 671 686 700

07 714 728 742 756 770 785 799 813 827 841

08 855 869 883 897 911 926 940 954 968 982

09 48 996 *010 *024 *038 *052 *066 *080 *094 *108 *122

310 49 136 150 164 178 192 206 220 234 248 262

11 276 290 304 318 332 346 360 374 388 402

12 415 429 443 457 471 485 499 513 527 541

13 554 568 582 596 610 624 638 651 665 679

14 693 707 721 734 748 762 776 790 803 817

15 831 845 859 872 886 900 914 927 941 955

16 49 969 982 996 *010 *024 *037 *051 *065 *079 *092

17 50 106 120 133 147 161 174 188 202 215 229

18 243 256 270 284 297 311 325 338 352 365

19 379 393 406 420 433 447 461 474 488 501

320 515 529 542 556 569 583 596 610 623 637

21 651 664 678 691 705 718 732 745 759 772

22 786 799 813 826 840 853 866 880 893 907

23 50 920 934 947 961 974 987 *001 *014 *028 *041

24 51 055 068 081 095 108 121 135 148 162 175

25 188 202 215 228 242 255 268 282 295 308

26 322 335 348 362 375 388 402 415 428 441

27 455 468 481 495 508 521 534 548 561 574

28 587 601 614 627 640 654 667 680 693 706

29 720 733 746 759 772 786 799 812 825 838

330 851 865 878 891 904 917 930 943 957 970

31 51 983 996 *009 *022 *035 *048 *061 *075 *088 *101

32 52 114 127 140 153 166 179 192 205 218 231

33 244 257 270 284 297 310 323 336 349 362

34 375 388 401 414 427 440 453 466 479 492

35 504 517 530 543 556 569 582 595 608 621

36 634 647 660 673 686 699 711 724 737 750

37 763 776 789 802 815 827 840 853 866 879

38 52 892 905 917 930 943 956 969 982 994 *007

39 53 020 033 046 058 071 084 097 110 122 135

340 148 161 173 186 199 212 224 237 250 263

41 275 288 301 314 326 339 352 364 377 390

42 403 415 428 441 453 466 479 491 504 517

43 529 542 555 567 580 593 605 618 631 643

44 656 668 681 694 706 719 732 744 757 769

45 782 794 807 820 832 845 857 870 882 895

46 53 908 920 933 945 958 970 983 995 *008 *020

47 54 033 045 058 070 083 095 108 120 133 145

48 158 170 183 195 208 220 233 245 258 270

49 283 295 307 320 332 345 357 370 382 394

350 54 407 419 432 444 456 469 481 494 506 518



4 

6.0 

5 

7.5 

6 

9.0 

7 

10.5 

8 

12.0 

9 

13.5 


14 

1 

1.4 

2 

2.8 

3 

4.2 

4 

5.6 

5 

7.0 

6

8.4 

7 

9.8 

8 

11.2 

9 

12.6 


13 

1 

1.3 

2 

2.6 

3 

3.9 

4 

5.2 

5 

6.5 

6 

7.8 

7 

9.1 

8 

10.4 

9 

11.7 


12 

1 

1.2 

2 

2.4 

3 

3.6 

4 

4.8 

5 

6.0 

6 

7.2 

7 

8.4 

8 

9.6 

9 

10.8 


2 3 


Prop. Parts 


232 

















Table II. Common Logarithms of Numbers to Five Decimal Places 


|( Prop. Parts 

O 


1 

2 

3 

4 

5 

( 6 

7 

8 

9 



350 

54 407 

419 

432 

444 

456 

469 

481 

494 

506 

518 



51 

531 

543 

555 

568 

580 

593 

605 

617 

630 

642 



52 

654 

667 

679 

691 

704 

716 

728 

741 

753 

765 


13 

53 

777 

790 

802 

814 

827 

839 

851 

864 

876 

888 

1 

1.3 

54 

54 900 

913 

925 

937 

949 

962 

974 

986 

998 

*011 

2 

2.6 

3.9 

5.2 

55 

55 023 

035 

047 

060 

072 

084 

096 

108 

121 

133 

4 

56 

145 

157 

169 

182 

194 

206 

218 

230 

242 

255 

5 

6 

6.5 

7.8 

57 

267 

279 

291 

303 

315 

328 

340 

352 

364 

376 


9.1 

10.4 

11.7 

58 

388 

400 

413 

425 

437 

449 

461 

473 

485 

497 

8 

59 

509 

522 

534 

546 

558 

570 

582 

594 

606 

618 

9 

360 

630 

642 

654 

666 

678 

691 

703 

715 

727 

739 



61 

751 

763 

775 

787 

799 

811 

823 

835 

847 

859 



62 

871 

883 

895 

907 

919 

931 

943 

955 

967 

979 



63 

55 991 

*003 

*015 

*027 

*038 

*050 

*062 

*074 

*086 

*098 



64 

56 110 

122 

134 

146 

158 

170 

182 

194 

205 

217 



65 

229 

241 

253 

265 

277 

289 

301 

312 

324 

336 


12 

66 

348 

360 

372 

384 

396 

407 

419 

431 

443 

455 

1 

1.2 












2 

2.4 

67 

467 

478 

490 

502 

514 

526 

538 

549 

561 

573 

3 

3.6 

68 

585 

597 

608 

620 

632 

644 

656 

667 

679 

691 

f 4 

4.8 

69 

703 

714 

726 

738 

750 

761 

773 

785 

797 

808 

5 

6.0 

7.2 












6 

7 

370 

820 

832 

844 

855 

867 

879 

891 

902 

914 

926 










*008 

*019 



8 

9.6 

71 

56 937 

949 

961 

972 

984 

996 

*031 

*043 

9 

10.8 

72 

57 054

066 

078 

089 

101 

113 

124 

136 

148 

159 



73 

171 

183 

194 

206 

217 

229 

241 

252 

264 

276 



74 

287 

299 

310 

322 

334 

345 

357 

368 

380 

392 



75 

403 

415 

426 

438 

449 

461 

473 

484 

496 

507 



76 

519 

530

542 

553 

565 

576 

588 

600 

611 

623 



77 

634 

646 

657 

669 

680 

692 

703 

715 

726 

738 


11 

1.1 

78 

749 

761 

772 

784 

795 

807 

818 

830 

841 

852 

1 

79 

864 

875 

887 

898 

1 910 

921 

933 

944 

955 

967 

2 

3 

2.2 

3.3 

380 

57 978 

990 

*001 

*013 

*024 

*035 

*047 

*058 

*070 

*081 

4 

e 

4.4 

5.5 

6.6 

81 

58 092 

104 

115 

127 

138 

149 

161 

172 

184 

195 

O 

6 

82 

206 

218 

229 

240 

1 252 

263 

274 

286 

297 

309 

7 

7.7 

83 

320 

331 

343 

354 

365 

377 

388 

399 

410 

422 

8 

9 

8.8 

9.9 

84 

433 

444 

456 

467 

I 478 

490 

501 

512 

524 

535 


85 

546 

557 

569 

580 

591 

602 

614 

625 

636 

647 



86 

659 

670 

681 

692 

704 

715 

726 

737 

749 

760 



87 

771 

782 

794 

805 

816 

827 

838 

850 

861 

872 



88 

883 

894 

906 

917 

928 

939 

950 

961 

973 

984 



89 

58 995 

*006 

*017 

*028 

*040 

*051 

*062 

*073 

*084 

*095 


10 

390 

59 106 

118 

129 

140 

151 

162 

173 

184 

195 

207 

1 

1.0 

91 

218 

229 

240 

251 

262 

273 

284 

295 

306 

318 

2 

2.0 

92 

329 

340 

351 

362 

373 

384 

395 

406 

417 

428 

3 

3.0 

93 

439 

450 

461 

472 

, 483 

494 

506 

517 

528 

539 

4 

4.0 












5 

5.0 

94 

550 

561 

572 

583 

594 

605 

616 

627 

638 

649 

6 

6.0 

95 

660 

671 

682 

693 

704 

715 

726 

737 

748 

759 

7 

7.0 

96 

770 

780 

791 

802 

813 

824 


846 

857 

868 

8 

8.0 








1 835 



9 

9.0

97 

879 

890 

901 

912 

923 

934 

945 

956 

966 

977 



98 

59 988 

999 

*010 

*021 

*032 

*043 

*054 

*065 

*076 

*086 



99 

60 097 

108 

119 

130 

141 

152 

163 

173 

184 

1 195 



400 

60 206 

217 

228 

239 

249 

260 

271 

282 

293 

1 304 

| Prop. Parts 


0 

1 

2 

3 

4 

5 

6 

7 

8 

9 
























Table II. Common Logarithms of Numbers to Five Decimal Places 


[El 

0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

Prop . Parts | 

400 

60 206 

217 

228 

239 

249 

260 

271 

282 

293 

304 



01 

314 

325 

336 

347 

358 

369 

379 

390 

401 

412 



02 

423 

433 

444 

455 

466 

477 

487 

498 

509 

520 



03 

531 

541 

552 

563 

574 

584 

595 

606 

617 

627 



04 

638 

649 

660 

670 

681 

692 

703 

713 

724 

735 



05 

746 

756 

767 

778 

788 

799 

810 

821 

831 

842 



06 

853 

863 

874 

885 

895 

906 

917 

927 

938 

949 


11 

1.1 

07 

60 959 

970 

981 

991 

*002 

*013 

*023 

*034 

*045 

*055 

1 

08 

61 066 

077 

087 

098 

109 

119 

130 

140 

151 

162 

2 

2.2 

09 

172 

183 

194 

204 

215 

225 

236 

247 

257 

268 

3 

3.3 











4 

4.4 

410 

278 

289 

300 

310 

321 

331 

342 

352 

363 

374 

5 

6 

5.5 

6.6 

11 

384 

395 

405 

416 

426 

437 

448 

458 

469 

479 

7 

7.7 

12 

490 

500 

511 

521 

532 

542 

553 

563 

574 

584 

8 

8.8 

13 

595 

606 

616 

627 

637 

648 

658 

669 

679 

690 

9 

9.9 

14 

700 

711 

721 

731 

742 

752 

763 

773 

784 

794 



15 

805 

815 

826 

836 

847 

857 

868 

878 

888 

899 



16 

61 909

920 

930 

941 

951 

962 

972 

982 

993 

*003 



17 

62 014 

024 

034 

045 

055 

066 

076 

086 

097 

107 



18 

118 

128 

138 

149 

159 

170 

180 

190 

201 

211 



19 

221 

232 

242 

252 

263 

273 

284 

294 

304 

315 



420 

325 

335 

346 

356 

366 

377 

387 

397 

408 

418 



21 

428 

439 

449 

459 

469 

480 

490 

500 

511 

521 


10 

22 

531 

542 

552 

562 

572 

583 

593 

603 

613 

624 


23 

634 

644 

655 

665 

675 

685 

696 

706 

716 

726 

1 

1.0 







2 

2.0 

24 

737 

747 

757 

767 

778 

788 

798 

808 

818 

829 

3 

3.0 

25 

839 

849 

859 

870 

880 

890 

900 

910 

921 

931 

4 

4.0 

26 

62 941 

951 

961 

972 

982 

992 

*002 

*012 

*022 

*033 

5 

6 

5.0 

6.0 

27 

63 043 

053 

063 

073 

083 

094 

104 

114 

124 

134 

7 

8 

9 

7.0 

8.0

28 

144 

155 

165 

175 

185 

195 

205 

215 

225 

236 

9.0

29 

246 

256 

266 

276 

286 

296 

306 

317 

327 

337 

430 

347 

357 

367 

377 

387 

397 

407 

417 

428 

438 



31 

448 

458 

468 

478 

488 

498 

508 

518 

528 

538 



32 

548 

558 

568 

579 

589 

599 

609 

619 

629 

639 



33 

649 

659 

669 

679 

689 

699 

709 

719 

729 

739 



34 

749 

759 

769 

779 

789 

799 

809 

819 

829 

839 



35 

849 

859 

869 

879 

889 

899 

909 

919 

929 

939 



36 

63 949 

959 

969 

979 

988 

998 

*008 

*018 

*028 

*038 



37 

64 048 

058 

068 

078 

088 

098 

108 

118 

128 

137 


9 

38 

147 

157 

167 

177 

187 

197 

207 

217 

227 

237 

*f 

0.9 

1.8 

39 

246 

256 

266 

276 

286 

296 

306 

316 

326 

335 

1 

2 1 

440 

345 

355 

365 

375 

385 

395 

404 

414 

424 

434 

3 

4 

2.7 

3.6 

41 

444 

454 

464 

473 

483 

493 

503 

513 

523 

532 

5 

0 

4.5 

5.4 

42 

542 

552 

562 

572 

582 

591 

601 

611 

621 

631 

ty 

6.3 

7.2 

43 

640 

650 

660 

670 

680 

689 

699 

709 1 

719 

729 

4 

8 

44 

738 

748 

758 

768 

777 

787 

797 

807 

816 

826 

9 

8.1 

45 

836 

846 

856 

865 

875 

885 

895 

904 

914 

924 



46 

64 933 

943 

953 

963 

972 

982 

992 

*002 

*011 

*021 



47 

65 031 

040 

050 

060 

070 

079 

089 

099 

108 

118 



48 

128 

137 

147 

157 

167 

176 

186 

196 

205 

215 



49 

225 

234 

244 

254 

263 

273 

283 

292 

302 

312 



450 

65 321 

331 

341 

350 

360 

369 

379 

389 

398 

408 



m 

0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

Prop . Parts 
















Table II. Common Logarithms of Numbers to Five Decimal Places 


Prop . Parts 



Prop . Parts 



450 65 321 331 341 350 360 369 379 389 398 408


51 418 427 437 447 456 466 475 485 495 504 

52 514 523 533 543 552 562 571 581 591 600 

53 610 619 629 639 648 658 667 677 686 696 

54 706 715 725 734 744 753 763 772 782 792 

55 801 811 820 830 839 849 858 868 877 887 

56 896 906 916 925 935 944 954 963 973 982 

57 65 992 *001 *011 *020 *030 *039 *049 *058 *068 *077 

58 66 087 096 106 115 124 134 143 153 162 172 

59 181 191 200 210 219 229 238 247 257 266 


460 276 285 295 304 314 323 332 342 351 361 

61 370 380 389 398 408 417 427 436 445 455

62 464 474 483 492 502 511 521 530 539 549 

63 558 567 577 586 596 605 614 624 633 642 

64 652 661 671 680 689 699 708 717 727 736 

65 745 755 764 773 783 792 801 811 820 829 

66 839 848 857 867 876 885 894 904 913 922

67 66 932 941 950 960 969 978 987 997 *006 *015 

68 67 025 034 043 052 062 071 080 089 099 108 

69 117 127 136 145 154 164 173 182 191 201 


470 210 219 228 237 247 256 265 274 284 293


71 302 311 321 330 339 348 357 367 376 385 

72 394 403 413 422 431 440 449 459 468 477 

73 486 495 504 514 523 532 541 550 560 569 

74 578 587 596 605 614 624 633 642 651 660 

75 669 679 688 697 706 715 724 733 742 752 

76 761 770 779 788 797 806 815 825 834 843 


77 852 

78 67 943 

79 68 034 


81 215 

82 305 

83 395 

84 485 

85 574 

86 664 

87 753 

88 842 

89 68 931 


870 879 
961 970 
052 061 


888 897 
979 988 
070 079 


906 916 925 934 
997 *006 *015 *024 
088 097 106 115 


142 151 160 169 178 187 196 205 


233 242 
323 332 
413 422 

502 511 
592 601 
681 690 

771 780 
860 869 
949 958 


251 260 
341 350 
431 440 

520 529 
610 619 
699 708 

789 797 
878 886 
966 975 


269 278 287 296 
359 368 377 386 
449 458 467 476 

538 547 556 565 
628 637 646 655 
717 726 735 744 

806 815 824 833 
895 904 913 922 
984 993 *002 *011 


490 69 020 028 037 046 055 064 073 082 090 099

91 108 117 126 135 144 152 161 170 179 188

92 197 205 214 223 232 241 249 258 267 276

93 285 294 302 311 320 329 338 346 355 364

94 373 381 390 399 408 417 425 434 443 452

95 461 469 478 487 496 504 513 522 531 539

96 548 557 566 574 583 592 601 609 618 627

97 636 644 653 662 671 679 688 697 705 714

98 723 732 740 749 758 767 775 784 793 801

99 810 819 827 836 845 854 862 871 880 888

500 69 897 906 914 923 932 940 949 958 966 975


2 | 3 I 4 | 5 


235 































Table II. Common Logarithms of Numbers to Five Decimal Places 


N 0 1 2 3 4 5 6 7 8 9

500 69 897 906 914 923 932 940 949 958 966 975


Prop. Parts 


01 69 984 992 *001 *010 *018 *027 *036 *044 *053 *062 

02 70 070 079 088 096 105 114 122 131 140 148 

03 157 165 174 183 191 200 209 217 226 234 

04 243 252 260 269 278 286 295 303 312 321 

05 329 338 346 355 364 372 381 389 398 406 

06 415 424 432 441 449 458 467 475 484 492 

07 501 509 518 526 535 544 552 561 569 578 

08 586 595 603 612 621 629 638 646 655 663 

09 672 680 689 697 706 714 723 731 740 749 

510 757 766 774 783 791 800 808 817 825 834 

11 842 851 859 868 876 885 893 902 910 919 

12 70 927 935 944 952 961 969 978 986 995 *003 

13 71 012 020 029 037 046 054 063 071 079 088 

14 096 105 113 122 130 139 147 155 164 172 

15 181 189 198 206 214 223 231 240 248 257 

16 265 273 282 290 299 307 315 324 332 341 

17 349 357 366 374 383 391 399 408 416 425 

18 433 441 450 458 466 475 483 492 500 508 

19 517 525 533 542 550 559 567 575 584 592 

520 600 609 617 625 634 642 650 659 667 675


21 684 692 700 709 717 725 734 742 750 759 

22 767 775 784 792 800 809 817 825 834 842 

23 850 858 867 875 883 892 900 908 917 925 

24 71 933 941 950 958 966 975 983 991 999 *008

25 72 016 024 032 041 049 057 066 074 082 090

26 099 107 115 123 132 140 148 156 165 173 

27 181 189 198 206 214 222 230 239 247 255 

28 263 272 280 288 296 304 313 321 329 337 

29 346 354 362 370 378 387 395 403 411 419 


530 428 436 444 452 460 469 477 485 493 501 

31 509 518 526 534 542 550 558 567 575 583 

32 591 599 607 616 624 632 640 648 656 665 

33 673 681 689 697 705 713 722 730 738 746 

34 754 762 770 779 787 795 803 811 819 827 

35 835 843 852 860 868 876 884 892 900 908 

36 916 925 933 941 949 957 965 973 981 989 

37 72 997 *006 *014 *022 *030 *038 *046 *054 *062 *070 

38 73 078 086 094 102 111 119 127 135 143 151 

39 159 167 175 183 191 199 207 215 223 231 

540 239 247 255 263 272 280 288 296 304 312

41 320 328 336 344 352 360 368 376 384 392

42 400 408 416 424 432 440 448 456 464 472 

43 480 488 496 504 512 520 528 536 544 552 

44 560 568 576 584 592 600 608 616 624 632 

45 640 648 656 664 672 679 687 695 703 711 

46 719 727 735 743 751 759 767 775 783 791 

47 799 807 815 823 830 838 846 854 862 870 

48 878 886 894 902 910 918 926 933 941 949 

49 73 957 965 973 981 989 997 *005 *013 *020 *028 

550 74 036 044 052 060 068 076 084 092 099 107 


2 


Prop. Parts 


236 
















Table II. Common Logarithms of Numbers to Five Decimal Places 


N 0 1 2 3 4 5 6 7 8 9

550 74 036 044 052 060 068 076 084 092 099 107

51 115 123 131 139 147 155 162 170 178 186

52 194 202 210 218 225 233 241 249 257 265

53 273 280 288 296 304 312 320 327 335 343

54 351 359 367 374 382 390 398 406 414 421

55 429 437 445 453 461 468 476 484 492 500

56 507 515 523 531 539 547 554 562 570 578

57 586 593 601 609 617 624 632 640 648 656

58 663 671 679 687 695 702 710 718 726 733

59 741 749 757 764 772 780 788 796 803 811


827 834 842 850 858 865 873 881 889 

61 896 904 912 920 927 935 943 950 958 966 

62 74 974 981 989 997 *005 *012 *020 *028 *035 *043 

63 75 051 059 066 074 082 089 097 105 113 120 

64 128 136 143 151 159 166 174 182 189 197 

65 205 213 220 228 236 243 251 259 266 274 

66 282 289 297 305 312 320 328 335 343 351 

67 358 366 374 381 389 397 404 412 420 427

68 435 442 450 458 465 473 481 488 496 504

69 511 519 526 534 542 549 557 565 572 580 

570 587 595 603 610 618 626 633 641 648 656

71 664 671 679 686 694 702 709 717 724 732 

72 740 747 755 762 770 778 785 793 800 808 

73 815 823 831 838 846 853 861 868 876 884

74 891 899 906 914 921 929 937 944 952 959 

75 75 967 974 982 989 997 *005 *012 *020 *027 *035 

76 76 042 050 057 065 072 080 087 095 103 110 


133 140 
208 215 
283 290 


163 170 
238 245 
313 320 


178 185 
253 260 
328 335 


433 440 
507 515 
582 589 

656 664 
730 738 
805 812 

879 886 
953 960 
026 034 


462 470 
537 545 
612 619 

686 693 
760 768 
834 842 

908 916 
982 989 
056 063 


477 485 
552 559 
626 634 

701 708 
775 782 
849 856 

923 930 
997 *004 
070 078 


173 181 
247 254 
320 327 


203 210 
276 283 
349 357 


217 225 
291 298 
364 371 


Prop. Parts 


94 379 386 393 401 408 415 422 430 437 444 

95 452 459 466 474 481 488 495 503 510 517 

96 525 532 539 546 554 561 568 576 583 590 

97 597 605 612 619 627 634 641 648 656 663 

98 670 677 685 692 699 706 714 721 728 735 

99 743 750 757 764 772 779 786 793 801 808 

600 77 815 822 830 837 844 851 859 866 873 880 


0 | 1 | 2 I 3 I 4 | 5 


237 
























Table II. Common Logarithms of Numbers to Five Decimal Places 


- 

Prop. 

Parts 


8 

1 

0.8 

2 

1.6 

3 

2.4 

4 

3.2

5 

4.0 

6 

4.8 

7 

5.6 

8 

6.4 

9 

7.2 


7 

1 

0.7 

2 

1.4 

3 

2.1 

4 

2.8 

5 

3.5 

6 

4.2 

7 

4.9 

8 

5.6 

9 

6.3 


6 

1 

0.6 

2 

1.2 

3 

1.8 

4 

2.4 

5 

3.0 

6 

3.6 

7

4.2 

8 

4.8 

9 

5.4 


2 3 4 5 


600 77 815 822 830 837 844 851 859 866 873 880 

01 887 895 902 909 916 924 931 938 945 952 

02 77 960 967 974 981 988 996 *003 *010 *017 *025 

03 78 032 039 046 053 061 068 075 082 089 097 

04 104 111 118 125 132 140 147 154 161 168 

05 176 183 190 197 204 211 219 226 233 240 

06 247 254 262 269 276 283 290 297 305 312 

07 319 326 333 340 347 355 362 369 376 383 

08 390 398 405 412 419 426 433 440 447 455 

09 462 469 476 483 490 497 504 512 519 526 

610 533 540 547 554 561 569 576 583 590 597 


11 604 611 618 625 633 640 647 654 661 668 

12 675 682 689 696 704 711 718 725 732 739 

13 746 753 760 767 774 781 789 796 803 810

14 817 824 831 838 845 852 859 866 873 880 

15 888 895 902 909 916 923 930 937 944 951 

16 78 958 965 972 979 986 993 *000 *007 *014 *021 

17 79 029 036 043 050 057 064 071 078 085 092 

18 099 106 113 120 127 134 141 148 155 162 

19 169 176 183 190 197 204 211 218 225 232 

620 239 246 253 260 267 274 281 288 295 302

21 309 316 323 330 337 344 351 358 365 372 

22 379 386 393 400 407 414 421 428 435 442 

23 449 456 463 470 477 484 491 498 505 511 

24 518 525 532 539 546 553 560 567 574 581 

25 588 595 602 609 616 623 630 637 644 650 

26 657 664 671 678 685 692 699 706 713 720 

27 727 734 741 748 754 761 768 775 782 789 

28 796 803 810 817 824 831 837 844 851 858 

29 865 872 879 886 893 900 906 913 920 927 

630 79 934 941 948 955 962 969 975 982 989 996 

31 80 003 010 017 024 030 037 044 051 058 065 

32 072 079 085 092 099 106 113 120 127 134 

33 140 147 154 161 168 175 182 188 195 202

34 209 216 223 229 236 243 250 257 264 271 

35 277 284 291 298 305 312 318 325 332 339 

36 346 353 359 366 373 380 387 393 400 407 

37 414 421 428 434 441 448 455 462 468 475 

38 482 489 496 502 509 516 523 530 536 543 

39 550 557 564 570 577 584 591 598 604 611 

640 618 625 632 638 645 652 659 665 672 679

41 686 693 699 706 713 720 726 733 740 747

42 754 760 767 774 781 787 794 801 808 814 

43 821 828 835 841 848 855 862 868 875 882 

44 889 895 902 909 916 922 929 936 943 949 

45 80 956 963 969 976 983 990 996 *003 *010 *017 

46 81 023 030 037 043 050 057 064 070 077 084

47 090 097 104 111 117 124 131 137 144 151 

48 158 164 171 178 184 191 198 204 211 218 

49 224 231 238 245 251 258 265 271 278 285 

650 81 291 298 305 311 318 325 331 338 345 351


0 12 3 4|5|6| 


Prop. Parts 


238 






















Table II. Common Logarithms of Numbers to Five Decimal Places



239 




















Table II. Common Logarithms of Numbers to Five Decimal Places 



0 |1|2|3|4|5 6 


700 84 510 516 522 528 535 541 547 553 559 566

01 572 578 584 590 597 603 609 615 621 628 

02 634 640 646 652 658 665 671 677 683 689 

03 696 702 708 714 720 726 733 739 745 751 

04 757 763 770 776 782 788 794 800 807 813 

05 819 825 831 837 844 850 856 862 868 874 

06 880 887 893 899 905 911 917 924 930 936 

07 84 942 948 954 960 967 973 979 985 991 997 

08 85 003 009 016 022 028 034 040 046 052 058 

09 065 071 077 083 089 095 101 107 114 120 

710 126 132 138 144 150 156 163 169 175 181

11 187 193 199 205 211 217 224 230 236 242

12 248 254 260 266 272 278 285 291 297 303 

13 309 315 321 327 333 339 345 352 358 364 

14 370 376 382 388 394 400 406 412 418 425 

15 431 437 443 449 455 461 467 473 479 485 

16 491 497 503 509 516 522 528 534 540 546 

17 552 558 564 570 576 582 588 594 600 606 

18 612 618 625 631 637 643 649 655 661 667 

19 673 679 685 691 697 703 709 715 721 727 

720 733 739 745 751 757 763 769 775 781 788

21 794 800 806 812 818 824 830 836 842 848

22 854 860 866 872 878 884 890 896 902 908 

23 914 920 926 932 938 944 950 956 962 968 

24 85 974 980 986 992 998 *004 *010 *016 *022 *028 

25 86 034 040 046 052 058 064 070 076 082 088 

26 094 100 106 112 118 124 130 136 141 147 

27 153 159 165 171 177 183 189 195 201 207 

28 213 219 225 231 237 243 249 255 261 267 

29 273 279 285 291 297 303 308 314 320 326 

730 332 338 344 350 356 362 368 374 380 386

31 392 398 404 410 415 421 427 433 439 445 

32 451 457 463 469 475 481 487 493 499 504 

33 510 516 522 528 534 540 546 552 558 564 

34 570 576 581 587 593 599 605 611 617 623 

35 629 635 641 646 652 658 664 670 676 682 

36 688 694 700 705 711 717 723 729 735 741 


747 753 759 
806 812 817 
864 870 876 


770 776 782 788 794 800 

829 835 841 847 853 859 

888 894 900 906 911 917 


923 929 935 

86 982 988 994 

87 040 046 052 
099 105 111 

157 163 169 
216 221 227 

274 280 286 

332 338 344 
390 396 402 
448 454 460 

87 506 512 518 


947 953 958 964 970 976 

*005 *011 *017 *023 *029 *035 

064 070 075 081 087 093 

122 128 134 140 146 151 

181 186 192 198 204 210 

239 245 251 256 262 268 

297 303 309 315 320 326 

355 361 367 373 379 384 

413 419 425 431 437 442 

471 477 483 489 495 500 

529 535 541 547 552 558 


6 


8 

4.8 

9 

5.4 


1 * 

1 


2 


3 

ilH 


Prop. Parts 


240 




















Table II. Common Logarithms of Numbers to Five Decimal Places 


Prop. Parts 


N 0 1 2 3 4 5 6 7 8 9

750 87 506 512 518 523 529 535 541 547 552 558

51 564 570 576 581 587 593 599 604 610 616

52 622 628 633 639 645 651 656 662 668 674

53 679 685 691 697 703 708 714 720 726 731

54 737 743 749 754 760 766 772 777 783 789

55 795 800 806 812 818 823 829 835 841 846

56 852 858 864 869 875 881 887 892 898 904

57 910 915 921 927 933 938 944 950 955 961

58 87 967 973 978 984 990 996 *001 *007 *013 *018

59 88 024 030 036 041 047 053 058 064 070 076

760 081 087 093 098 104 110 116 121 127 133

61 138 144 150 156 161 167 173 178 184 190

62 195 201 207 213 218 224 230 235 241 247

63 252 258 264 270 275 281 287 292 298 304

64 309 315 321 326 332 338 343 349 355 360

65 366 372 377 383 389 395 400 406 412 417

66 423 429 434 440 446 451 457 463 468 474

67 480 485 491 497 502 508 513 519 525 530

68 536 542 547 553 559 564 570 576 581 587

69 593 598 604 610 615 621 627 632 638 643

770 649 655 660 666 672 677 683 689 694 700

71 705 711 717 722 728 734 739 745 750 756

72 762 767 773 779 784 790 795 801 807 812

73 818 824 829 835 840 846 852 857 863 868

74 874 880 885 891 897 902 908 913 919 925

75 930 936 941 947 953 958 964 969 975 981

76 88 986 992 997 *003 *009 *014 *020 *025 *031 *037

77 89 042 048 053 059 064 070 076 081 087 092

78 098 104 109 115 120 126 131 137 143 148

79 154 159 165 170 176 182 187 193 198 204

780 209 215 221 226 232 237 243 248 254 260

81 265 271 276 282 287 293 298 304 310 315

82 321 326 332 337 343 348 354 360 365 371

83 376 382 387 393 398 404 409 415 421 426

84 432 437 443 448 454 459 465 470 476 481

85 487 492 498 504 509 515 520 526 531 537

86 542 548 553 559 564 570 575 581 586 592

87 597 603 609 614 620 625 631 636 642 647

88 653 658 664 669 675 680 686 691 697 702

89 708 713 719 724 730 735 741 746 752 757

790 763 768 774 779 785 790 796 801 807 812

91 818 823 829 834 840 845 851 856 862 867

92 873 878 883 889 894 900 905 911 916 922

93 927 933 938 944 949 955 960 966 971 977

94 89 982 988 993 998 *004 *009 *015 *020 *026 *031

95 90 037 042 048 053 059 064 069 075 080 086

96 091 097 102 108 113 119 124 129 135 140

97 146 151 157 162 168 173 179 184 189 195

98 200 206 211 217 222 227 233 238 244 249

99 255 260 266 271 276 282 287 293 298 304

800 90 309 314 320 325 331 336 342 347 352 358


241 















Table II. Common Logarithms of Numbers to Five Decimal Places



27 751 

28 803 

29 


908 

913 

918 

924 

929 

934 

939 

944 


31 91 960 965 971 976 981 986 991 997 *002 *007

32 92 012 018 023 028 033 038 044 049 054 059

33 065 070 075 080 085 091 096 101 106 111

34 117 122 127 132 137 143 148 153 158 163

35 169 174 179 184 189 195 200 205 210 215

36 221 226 231 236 241 247 252 257 262 267

37 273 278 283 288 293 298 304 309 314 319

38 324 330 335 340 345 350 355 361 366 371

39 376 381 387 392 397 402 407 412 418 423

840 428 433 438 443 449 454 459 464 469 474


41 480 485 490 495 500 505 511 516 521 526 

42 531 536 542 547 552 557 562 567 572 578 

43 583 588 593 598 603 609 614 619 624 629

44 634 639 645 650 655 660 665 670 675 681 

45 686 691 696 701 706 711 716 722 727 732 

46 737 742 747 752 758 763 768 773 778 783 

47 788 793 799 804 809 814 819 824 829 834 

48 840 845 850 855 860 865 870 875 881 886 

49 891 896 901 906 911 916 921 927 932 937 

850 92 942 947 952 957 962 967 973 978 983 988 


6 


Prop. Parts 


242 










































Table II. Common Logarithms of Numbers to Five Decimal Places










































Table II. Common Logarithms of Numbers to Five Decimal Places 



244 , 



















Table II. Common Logarithms of Numbers to Five Decimal Places 































INDEX 


Arithmetic mean, 32 
short method of computing, 38 
of sub-sets, 41, 97 
Array, 180 

Asymmetry, see skewness 
Averages, Chapter III 

discussion of different, 49, 51, 53-57

Average deviation, see mean deviation

Charlier check, 64, 65, 84 
Charts, 23 
ratio, 149 

Classification of data, 8-14 
Class 

boundary, 14 
interval, 12 
limits, 14 
marks, 10 
mid-value of, 10 
Coefficient 
of alienation, 173 
of correlation, Chapter VIII
of variation, 86 
Collateral reading, 5 
Combination of sets, 94 
Compound interest law, 52, 151 
Computing machines, 4, 68 
Correlation 

and regression, 167 
coefficient, Chapter VIII 
linear, 161 
non-linear, 198 ff. 
rank, 209 
ratio, 198 ff. 

relation to common causes, 214 
interpretation of, 213 
intraclass, 221 


surface, 193 ff. 
table, 177 ff. 

Cumulative frequencies, 15, 27, 125

Curve of error, see normal curve 
Curve fitting, Chapter VII 
Curves of growth, 52, 145, 154, 157

Deviation, 35 

mean or average, 81 
root-mean-square, 83n, 95 
Dispersion, see measures of dispersion
relative, 86 

Estimate, standard error of, 168 
Frequency 

curves, 24, 107 
distributions, Chapter I 
graphical representation of, Chapter II
polygon, 23 

Function, definition, 22 
exponential, 145 
frequency, 107 
linear, 131 
parabolic, 152 
quadratic, 133 
Geometric mean, 51 
Gompertz curve, 154 
Graduation by means of normal curve, 122

Graphical representation, Chapters II, VII
Harmonic mean, 53 
Histogram, 23 
Kurtosis, 71, 106 
Least-squares method, 139 
Logarithmic charts, see ratio charts




Logistic curve, see Pearl-Reed curve

Makeham’s law, 157 
Mean 

arithmetic, 32 ff. 
geometric, 51 
harmonic, 53 
of means, 41 
Mean deviation, 81 
Measures of dispersion, Chapter V

mean deviation, 81 
quartiles, 79 

semi-interquartile range, 79 
standard deviation, 83 ff. 
Median, 46 ff. 

Mode, 45 

Moments of a distribution, Chapter IV

method of, 135 
Normal curve, Chapter VI 
explanation of tables of, 110 
fitted to observed data, 118 
properties of, 112 
standard form of, 109 
Normal equations, 139, 144, 153 
Ogive, 26 

Parabola, fitting a, 152 
Parameter, 119, 135, 146 
Pearl-Reed curve, 157 
Predictions, reliability of, 196 
Probability paper, 125 
Quartiles, 79 
of normal curve, 113 
Range, 15 
Ratio charts, 149 


Regression 
coefficients of, 166 
linear, 165 
non-linear, 198 
testing linearity of, 204 
Scatter diagram, 160 
Semi-logarithmic paper, 149 
Sheppard’s corrections, 75, 84 
Skewness, 104 
Snedecor, G. W., 167 
Standard units, 67 
Statistic, 119 
Standard deviation, 66 
of combination of sets, 94 
of grouped data, 83 
of ungrouped data, 89 ff. 
Straight line, 131 
fitting to data, 134 
Symmetry, 71, 104-107 
Tables 

areas under normal curve, Appendix
logarithms of numbers, Appendix
ordinates of normal curve, Appendix
Tabulation, 9 
Time series, 143 
Translation of axes, 35 
Trend, 143, 145 
Variance, 84 
Variates, 7 

Variability, see dispersion 
Weighted mean, 32 
Wilson, E. B., 214 









